PEPTIDE EXTENDED GLYCOSYLATED POLYPEPTIDES

FIELD OF THE INVENTION

The present invention relates to novel glycosylated polypeptides as well as means and methods
for their preparation.

BACKGROUND OF THE INVENTION

Polypeptides, including proteins, are used for a wide range of applications, including industrial
uses and human or veterinary therapy.

One generally recognized drawback associated with polypeptides is that they do not
have a sufficiently high stability, are immunogenic or allergenic, have a reduced serum
half-life, are susceptible to clearance, are susceptible to proteolytic degradation, and the like.

One method for improving properties of polypeptides has been to attach non-peptide moieties
to the polypeptide. For instance, polymer molecules such as PEG have been used for reducing
immunogenicity and/or increasing serum half-life of therapeutic polypeptides and for reducing
allergenicity of industrial enzymes. Glycosylation has been suggested as another convenient
route for improving properties of polypeptides such as stability, half-life, etc.

Machamer and Rose, J. Biol. Chem., 1988, 263, 5948-5954 and 5955-5960, disclose
modified glycoprotein G of vesicular stomatitis virus that is glycosylated at additional
N-glycosylation sites introduced in the polypeptide backbone.

US 5,218,092 discloses physiologically active polypeptides with at least one new or
additional carbohydrate attached thereto. The additional carbohydrate molecule(s) is/are
provided by adding one or more additional N-glycosylation sites to the polypeptide backbone,
and expressing the polypeptide in a glycosylating host cell.

US 5,041,376 discloses a method of identifying or shielding epitopes of a transportable
protein, in which method an N-glycosylation site is introduced on the exposed surface of the
protein backbone (using oligonucleotide-directed mutagenesis of the nucleotide sequence
encoding the protein), the resulting protein is expressed, glycosylated and assayed for protein
activity and for shielded epitopes.

WO 00/26354 discloses a method of reducing the allergenicity of proteins by including
an additional glycosylation site in the protein backbone and glycosylating the resulting protein
variant.

Guan et al., Cell, 1985, Vol. 42, 489-496 disclose glycosylated fusion protein variants
comprising a rat growth hormone backbone C-terminally extended with transmembrane and
cytoplasmic domains of the vesicular stomatitis virus glycoprotein, which growth hormone
backbone has been modified to incorporate two additional N-glycosylation sites.

WO 97/04079 discloses lipolytic enzymes modified by an N- or C-terminal peptide
extension capable of conferring improved performance, in particular wash performance, to the
enzyme.

Matsuura et al., Nature Biotechnology, 1999, Vol. 17, 58-61 disclose the use of random 
elongation mutagenesis for improving thermostability of a non-glycosylated microbial catalase. 
The random elongation mutagenesis is conducted in the C-terminal end of the catalase. 

US 5,338,835, entitled CTP extended forms of FSH, describes the use of the C-terminal
portion of the CG beta subunit or a variant thereof for extension of the C-terminal of CG, FSH
and LH. Said C-terminal portion may comprise O-glycosylation sites. It is speculated that a
similar approach may be used for other proteins.

US 5,508,261 discloses alpha, beta-heterodimeric polypeptides having binding affinity
to vertebrate luteinizing hormone (LH) receptors and vertebrate follicle stimulating hormone
(FSH) receptors comprising a glycoprotein hormone alpha-subunit polypeptide and a specified
non-naturally occurring beta-subunit polypeptide.

WO 95/05465 discloses EPO analogs which have one or more amino acids extending
from the C-terminal end of EPO, the C-terminal extension having at least one additional
carbohydrate site. The 28 amino acid C-terminal part of CG (having four O-glycosylation sites)
is mentioned as an example.

WO 97/30161 discloses hybrid proteins comprising two coexpressed amino acid
sequences forming a dimer, each comprising a) at least one amino acid sequence selected from
a homomeric receptor, a chain of a heteromeric receptor, a ligand, and fragments thereof; and b)
a subunit of a heterodimeric proteinaceous hormone or fragments thereof; in which a) and b)
are bonded directly or through a peptide linker, and, in each couple, the two subunits (b) are
different and capable of aggregating to form a dimer complex.

In none of the above references has it been disclosed or indicated that a polypeptide of
interest can be modified to include additional glycosylation sites by N-terminally extending
said polypeptide with a peptide sequence comprising one or more additional glycosylation
sites. The present invention is based on this finding.

BRIEF DESCRIPTION OF THE INVENTION

Accordingly, in a first aspect the invention relates to a glycosylated polypeptide comprising the
primary structure,

NH2-X-Pp-COOH

wherein

X is a peptide addition comprising or contributing to a glycosylation site, and

Pp is a polypeptide of interest.

The introduction of additional glycosylation sites by means of a peptide addition is an
elegant way of providing additional glycosylation sites in a polypeptide of interest. More
specifically, the invention has the advantage that polypeptides with altered glycosylation
pattern are more easily obtained, e.g. the variants can be designed without detailed knowledge
or use of structural and/or functional properties of the polypeptide. Also, the utilization of
glycosylation sites introduced by a peptide addition has been found to be improved relative to
glycosylation sites introduced within a structural part of the polypeptide Pp. Also other
properties of the peptide extended polypeptide, such as uptake in specific cells, may be
improved relative to a polypeptide modified with glycosylation sites in a structural part (and
not being subjected to peptide extension).

In a second aspect the invention relates to a glycosylated polypeptide comprising the
primary structure NH2-Px-X-Py-COOH, wherein

Px is an N-terminal part of a polypeptide Pp of interest,
Py is a C-terminal part of said polypeptide Pp, and

X is a peptide addition comprising or contributing to a glycosylation site.

In other aspects the invention relates to a nucleotide sequence encoding a polypeptide 
of the invention, an expression vector comprising said nucleotide sequence and methods of 
preparing a polypeptide of the invention.

In a further aspect the invention relates to a method of improving (a) selected
property/ies of a polypeptide Pp of interest, which method comprises
a) preparing a nucleotide sequence encoding a polypeptide comprising the primary structure
NH2-X-Pp-COOH,
wherein
X is a peptide addition comprising or contributing to a glycosylation site, the peptide addition
being capable of conferring the selected improved property/ies to the polypeptide Pp,
b) expressing the nucleotide sequence of a) in a suitable host cell under conditions ensuring
attachment of an oligosaccharide moiety thereto, optionally
c) conjugating the expressed polypeptide of b) to a second non-peptide moiety, and
d) recovering the polypeptide resulting from step c).

DRAWINGS

Figure 1 is a dose-response curve for uptake of glucocerebrosidase, wildtype and modified
according to the invention, into J774E macrophages. The activity is measured by the GCB
activity assay.

Figure 2 illustrates the pharmacokinetics of an FSH polypeptide produced according to the
invention.

DETAILED DISCLOSURE OF THE INVENTION

DEFINITIONS

In the context of the present application and invention the following definitions apply:

The term "conjugate" is used about the covalent attachment of one or more
polypeptide(s) to one or more non-peptide moieties. The term covalent attachment means that
the polypeptide and the non-peptide moiety are either directly covalently joined to one another,
or else are indirectly covalently joined to one another through an intervening moiety or
moieties, such as a bridge, spacer, or linkage moiety or moieties.

The term "non-peptide moiety" is intended to indicate a molecule, different from a
peptide polymer composed of amino acid monomers linked together by peptide bonds,
which molecule is capable of conjugating to an attachment group of the polypeptide of the
invention. Preferred examples of such molecule include polymers, e.g. polyalkylene oxide
moieties, and lipophilic groups, e.g. fatty acids and ceramides. The term "polymer molecule" is
defined as a molecule formed by covalent linkage of two or more monomers and may be used
interchangeably with "polymeric group". Except where the number of non-peptide moieties,
such as polymeric groups, attached to the polypeptide is expressly indicated, every reference to
"non-peptide moiety" referred to herein is intended as a reference to one or more non-peptide
moieties attached to the polypeptide.

The term "oligosaccharide moiety" is intended to indicate a carbohydrate-containing molecule
comprising one or more monosaccharide residues, capable of being attached to the polypeptide
(to produce a glycosylated polypeptide) by way of in vivo or in vitro glycosylation. Except
where the number of oligosaccharide moieties attached to the polypeptide is expressly
indicated, every reference to "oligosaccharide moiety" referred to herein is intended as a
reference to one or more such moieties attached to the polypeptide.

The term "in vivo glycosylation" is intended to mean any attachment of an
oligosaccharide moiety occurring in vivo, i.e. during posttranslational processing in a
glycosylating cell used for expression of the polypeptide, e.g. by way of N-linked and O-linked
glycosylation. Usually, the N-glycosylated oligosaccharide moiety has a common basic core
structure composed of five monosaccharide residues, namely two N-acetylglucosamine
residues and three mannose residues. The exact oligosaccharide structure depends, to a large
extent, on the glycosylating organism in question and on the specific polypeptide. Depending
on the host cell in question the glycosylation is classified as a high mannose type, a complex
type or a hybrid type. The term "in vitro glycosylation" is intended to refer to a synthetic
glycosylation performed in vitro, normally involving covalently linking an oligosaccharide
moiety to an attachment group of a polypeptide, optionally using a cross-linking agent. In vivo
and in vitro glycosylation are discussed in detail further below.

An "N-glycosylation site" has the sequence N-X'-S/T/C-X'', wherein X' is any amino
acid residue except proline, X'' is any amino acid residue that may or may not be identical to X'
and preferably is different from proline, N is asparagine and S/T/C is either serine, threonine or
cysteine, preferably serine or threonine, and most preferably threonine. The oligosaccharide
moiety is attached to the N-residue of such site. An "O-glycosylation site" is the OH-group of a
serine or threonine residue. An "in vitro glycosylation site" is, e.g., selected from the group
consisting of the N-terminal amino acid residue of the polypeptide, the C-terminal residue of
the polypeptide, lysine, cysteine, arginine, glutamine, aspartic acid, glutamic acid, serine,
tyrosine, histidine, phenylalanine and tryptophan. Of particular interest is an in vitro
glycosylation site that is an epsilon-amino group, in particular as part of a lysine residue.
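
The N-glycosylation site definition above can be illustrated with a short scan over a one-letter amino acid sequence. This is a minimal sketch, not part of the disclosed method: the function name, the optional flag and the toy sequence are hypothetical, and a match only indicates a sequence motif, not that the site is actually glycosylated in a given host cell.

```python
import re

# Motif as defined in the text: N - X' - S/T/C - X'', where X' is any residue
# except proline. X'' is only "preferably" not proline, so that restriction is
# exposed as an optional flag (an assumption of this sketch).
def find_n_glycosylation_sites(sequence, exclude_proline_after=False):
    """Return 1-based positions of the Asn residue of each matching motif."""
    tail = "[^P]" if exclude_proline_after else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile(r"(?=N[^P][STC]" + tail + ")")
    return [m.start() + 1 for m in pattern.finditer(sequence.upper())]

# Hypothetical toy sequence: N2 and N11 match; N7 is followed by proline.
print(find_n_glycosylation_sites("ANITAANPSANCTP"))        # [2, 11]
print(find_n_glycosylation_sites("ANITAANPSANCTP", True))  # [2]
```

The second call drops the site at position 11 because its X'' residue is proline, which the definition merely disfavors.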

The term "peptide addition" is intended to indicate one or more consecutive amino acid
residues that are added to the amino acid sequence of the polypeptide Pp of interest. Normally,
the peptide addition is linked to the amino acid sequence of the polypeptide Pp by a peptide
linkage.

The term "attachment group" is intended to indicate a functional group of the
polypeptide, in particular of an amino acid residue thereof or an oligosaccharide moiety
attached to the polypeptide, capable of attaching a non-peptide moiety of interest. Useful
attachment groups and their matching non-peptide moieties are apparent from the table below.



| Attachment group | Amino acid | Examples of non-peptide moiety | Conjugation method/Activated PEG | Reference |
|---|---|---|---|---|
| -NH2 | N-terminal, Lys | Polymer, e.g. PEG, with amide or imine group; Lipophilic substituent | mPEG-SPA; Tresylated mPEG | Shearwater Inc.; Delgado et al., Critical Reviews in Therapeutic Drug Carrier Systems 9(3,4):249-304 (1992); WO 97/31022 |
| -COOH | C-term, Asp, Glu | Polymer, e.g. PEG, with ester or amide group | mPEG-Hz | Shearwater Inc. |
| -SH | Cys | Polymer, e.g. PEG, with disulfide, maleimide or vinyl sulfone group | PEG-vinylsulphone; PEG-maleimide | Shearwater Inc.; Delgado et al., Critical Reviews in Therapeutic Drug Carrier Systems 9(3,4):249-304 (1992) |
| -OH | Ser, Thr, OH-, Lys | PEG with ester, ether, carbamate, carbonate | | |
| -CONH2 | | Polymer, e.g. PEG | | |
| Aldehyde, Ketone | Oxidized oligosaccharide | Polymer, e.g. PEG | PEG-hydrazide | Andresz et al., 1978, Makromol. Chem. 179:301; WO 92/16555; WO 00/23114 |

The term "comprising an attachment group" is intended to mean that the attachment
group is present on an amino acid residue of the relevant peptide or polypeptide or on an
oligosaccharide moiety attached to said peptide or polypeptide.

The term "contributing to a glycosylation site" as used in connection with the peptide
addition X is intended to cover the situation where a glycosylation site is formed from more
than one amino acid residue (as is the case with an N-glycosylation site), and where at least one
such amino acid residue originates from the peptide X and at least one amino acid residue
originates from the polypeptide Pp, whereby the glycosylation site can be considered to bridge
X and Pp (or, where relevant, Px or Py).
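
To make the distinction between "comprising" and "contributing to" a glycosylation site concrete, the following sketch classifies each N-glycosylation motif in the fused sequence X-Pp by where its residues originate. The function name, labels and example sequences are hypothetical, and the sketch assumes that all four motif positions (N, X', S/T/C, X'') count as residues forming the site.

```python
import re

# N - X' - S/T/C - X'' motif, X' not proline; lookahead reports overlaps.
_MOTIF = re.compile(r"(?=N[^P][STC].)")

def classify_sites(addition, polypeptide):
    """Classify motifs in the fusion addition + polypeptide (X followed by Pp).

    Returns (1-based Asn position in the fusion, label) pairs, where the label
    is "within X", "bridging" or "within Pp".
    """
    fused = (addition + polypeptide).upper()
    boundary = len(addition)  # 0-based index of the first Pp residue
    results = []
    for match in _MOTIF.finditer(fused):
        start, end = match.start(), match.start() + 3  # motif spans 4 residues
        if end < boundary:
            label = "within X"      # X comprises the site
        elif start >= boundary:
            label = "within Pp"     # site already present in Pp
        else:
            label = "bridging"      # X contributes to the site
        results.append((start + 1, label))
    return results

# Hypothetical example: the addition "AN" supplies Asn, Pp supplies G-T-A.
print(classify_sites("AN", "GTAK"))     # [(2, 'bridging')]
print(classify_sites("ANITA", "NGTQ"))  # [(2, 'within X'), (6, 'within Pp')]
```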

The term "non-structural part" as used about a part of the polypeptide Pp of interest is
intended to indicate a part of either the C- or N-terminal end of the folded polypeptide (e.g.
protein) that is outside the first structural element, such as an α-helix or a β-sheet structure.
The non-structural part can easily be identified in a three-dimensional structure or model of the
polypeptide. If no structure or model is available, a non-structural part typically comprises or
consists of the first or last 1-20 (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or
20) amino acid residues, such as 1-10 amino acid residues of the amino acid sequence
constituting the mature form of the polypeptide of interest.

Amino acid names and atom names (e.g. CA, CB, NZ, N, O, C, etc.) are used as defined
by the Protein DataBank (PDB) (www.pdb.org), which are based on the IUPAC nomenclature
(IUPAC Nomenclature and Symbolism for Amino Acids and Peptides (residue names, atom
names etc.), Eur. J. Biochem., 138, 9-37 (1984) together with their corrections in Eur. J.
Biochem., 152, 1 (1985)). The term "amino acid residue" is intended to indicate an amino acid
residue contained in the group consisting of alanine (Ala or A), cysteine (Cys or C), aspartic
acid (Asp or D), glutamic acid (Glu or E), phenylalanine (Phe or F), glycine (Gly or G),
histidine (His or H), isoleucine (Ile or I), lysine (Lys or K), leucine (Leu or L), methionine
(Met or M), asparagine (Asn or N), proline (Pro or P), glutamine (Gln or Q), arginine (Arg or
R), serine (Ser or S), threonine (Thr or T), valine (Val or V), tryptophan (Trp or W), and
tyrosine (Tyr or Y) residues. The terminology used for identifying amino acid
positions/mutations is illustrated as follows: A15 (indicates an alanine residue in position 15 of
the polypeptide), A15T (indicates replacement of the alanine residue in position 15 with a
threonine residue), A15[T/S] (indicates replacement of the alanine residue in position 15 with a
threonine residue or a serine residue). Multiple substitutions are indicated with a "+", e.g.
A15T+F57S means an amino acid sequence which comprises a substitution of the alanine
residue in position 15 for a threonine residue and a substitution of the phenylalanine residue in
position 57 for a serine residue.
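
The position/mutation notation above is regular enough to be parsed mechanically. The sketch below is illustrative only: the function names and the three-residue example sequence are hypothetical, and only the forms shown in the text (A15, A15T, A15[T/S], joined by "+") are handled.

```python
import re

# One token: original residue, position, then optionally a single replacement
# (A15T) or a bracketed list of alternatives (A15[T/S]).
_MUTATION = re.compile(r"^([A-Z])(\d+)(?:([A-Z])|\[([A-Z](?:/[A-Z])*)\])?$")

def parse_mutations(spec):
    """Parse e.g. 'A15T+F57S' into (original, position, replacements) tuples."""
    parsed = []
    for token in spec.split("+"):
        match = _MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        original, position, single, alternatives = match.groups()
        if single:
            replacements = [single]
        elif alternatives:
            replacements = alternatives.split("/")
        else:
            replacements = []  # bare position such as 'A15'
        parsed.append((original, int(position), replacements))
    return parsed

def apply_mutations(sequence, spec):
    """Apply unambiguous substitutions to a sequence; positions are 1-based."""
    residues = list(sequence)
    for original, position, replacements in parse_mutations(spec):
        if len(replacements) > 1:
            raise ValueError("ambiguous replacement; choose one residue")
        if not 1 <= position <= len(residues) or residues[position - 1] != original:
            raise ValueError(f"expected {original} at position {position}")
        if replacements:
            residues[position - 1] = replacements[0]
    return "".join(residues)

print(parse_mutations("A15T+F57S"))      # [('A', 15, ['T']), ('F', 57, ['S'])]
print(apply_mutations("MAF", "A2T+F3S"))  # MTS
```

Checking the original residue before substituting catches numbering mismatches, for example between a precursor and the mature form of a polypeptide.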

The term "nucleotide sequence" is intended to indicate a consecutive stretch of two or
more nucleotides. The nucleotide sequence can be of genomic, cDNA, RNA, semisynthetic,
synthetic origin, or any combinations thereof.

"Cell", "host cell", "cell line" and "cell culture" are used interchangeably herein and all
such terms should be understood to include progeny resulting from growth or culturing of a
cell. "Transformation" and "transfection" are used interchangeably to refer to the process of
introducing DNA into a cell.

"Operably linked" refers to the covalent joining of two or more nucleotide sequences in
such a manner that the normal function of the sequences can be performed. For example, the
nucleotide sequence encoding a presequence or secretory leader is operably linked to a
nucleotide sequence for a polypeptide if it is expressed as a preprotein that participates in the
secretion of the polypeptide; a promoter or enhancer is operably linked to a coding sequence if
it affects the transcription of the sequence.

"Introduction" or "removal" of a glycosylation site or an attachment group for a
non-peptide moiety is normally achieved by introducing or removing an amino acid residue
comprising or contributing to such site or group to/from the relevant amino acid sequence,
conveniently by suitable modification of the encoding nucleotide sequence. For instance, when
an N-glycosylation site is to be introduced/removed this can be done by introducing/removing
a codon for the amino acid residue(s) required for a functional N-glycosylation site. When an
attachment group for a PEG molecule is to be introduced/removed, it will be understood that
this can be done by introducing/removing a codon for an amino acid residue, e.g. a lysine residue,
comprising such group to/from the encoding nucleotide sequence. The term "introduce" is
primarily intended to include substitution of an existing amino acid residue, but can also mean
insertion of an additional amino acid residue. The term "remove" is primarily intended to include
substitution of the amino acid residue to be removed for another amino acid residue, but can
also mean deletion (without substitution) of the amino acid residue to be removed.

The term "epitope" is used in its conventional meaning to indicate one or more amino
acid residue(s) displaying specific 3D and/or charge characteristics at the surface of the
polypeptide, which is/are capable of giving rise to an immune response in a mammal and/or
specifically binding to an antibody raised against said epitope or which is/are capable of giving
rise to an allergic response.

The term "unshielded epitope" is intended to indicate that the epitope is not shielded
and therefore has the above properties. The term "shielded epitope" is intended to indicate that
the non-peptide moiety shields, and thus inactivates the epitope, whereby it is no longer
capable of giving rise to any substantial immune response in a mammal, e.g. due to
inappropriate processing and/or presentation in the antigen presenting cells, and/or of reacting
with an antibody raised against the unshielded epitope. The shielding should thus be effective
in both the naive mammal and mammals that already produce antibodies reacting with the
unshielded epitope.

The degree of shielding of epitopes can be determined as reduced immunogenicity
and/or reduced antibody reactivity and/or reduced reactivity with monoclonal antibodies raised
against the epitope(s) in question using methods known in the art. The degree of shielding of
allergenic epitopes can be determined, e.g., as described in WO 00/26354.

The term "reduced" as used about an immunogenic or allergic response is intended to
indicate that a given molecule gives rise to a measurably lower immune or allergic response
than a reference molecule, when determined under comparable conditions. Preferably, the
relevant response is reduced by at least 25%, such as at least 50%, such as preferably by at
least 75%, such as by at least 90% or even at least 100%.

The term "serum half-life" is used in its normal meaning, i.e. the time in which half of
the relevant molecules circulate in the plasma or bloodstream prior to being cleared.
Alternatively used terms include "plasma half-life", "circulating half-life", "serum clearance",
"plasma clearance" and "clearance half-life". The term "functional in vivo half-life" is the time
in which 50% of a given function (such as biological activity) of the relevant molecule is
retained, when tested in vivo (such as the time at which 50% of the biological activity of the
molecule is still present in the body/target organ, or the time at which the activity of the
polypeptide is 50% of the initial value). The molecule is normally cleared by the action of one
or more of the reticuloendothelial systems (RES), kidney (e.g. by glomerular filtration), spleen
or liver, or receptor-mediated elimination, or degraded by specific or unspecific proteolysis.
Normally, clearance depends on size or hydrodynamic volume (relative to the cut-off for
glomerular filtration), shape/rigidity, charge, attached carbohydrate chains, and the presence of
cellular receptors for the molecule. The term "increased" as used about serum half-life or
functional in vivo half-life is used to indicate that the relevant half-life of the relevant molecule
is statistically significantly increased relative to that of the reference molecule as determined
under comparable conditions. For instance, the relevant half-life is increased by at least 25%,
such as by at least 50%, by at least 100% or by at least 1000%.

The term "function" is intended to indicate one or more specific functions of the
polypeptide of interest and is to be understood qualitatively (i.e. having a similar function as
the polypeptide of interest) and not necessarily quantitatively (i.e. the magnitude of the
function is not necessarily similar). Typically, a given polypeptide has many different
functions, examples of which are given further below in the section entitled "Screening for or
measurement of function". For therapeutically useful polypeptides an important "function" is
biological activity, e.g. in vitro or in vivo bioactivity. For enzymes, an important function is
biological activity such as catalytic activity.

The interchangeably used terms "measurable function" and "functional" are intended to
indicate that the relevant function (preferably reflecting the intended use) of a polypeptide of
the invention is above detection limit when measured by standard methods known in the art,
e.g. as an in vitro bioactivity and/or in vivo bioactivity. For instance, if the polypeptide is a
hormone and the function of interest is the hormone's affinity towards a specific receptor, a
measurable function is defined to be a detectable affinity between the hormone modified in
accordance with the invention and the receptor as determined by the normal methods used for
measuring such affinity. If the polypeptide is an enzyme and a function of interest is the
catalytic activity, a measurable function is the enzyme's ability to catalyze a reaction involving
the normal substrates for the enzyme as measured by the normal methods for determining the
enzyme activity in question. Typically, if not otherwise stated herein, a measurable function is
at least 2%, such as at least 5% of that of the unmodified polypeptide Pp, as determined under
comparable conditions, e.g. in the range of 2-1000%, such as 2-500% or 2-100%, such as
5-100% of that of the unmodified polypeptide.

The term "functional site" is intended to indicate one or more amino acid residues
which is/are essential for or otherwise involved in the function or performance of the
polypeptide, i.e. the amino acid residue(s) that mediate(s) a desired biological activity of the
polypeptide Pp. Such amino acid residues are "located at" the functional site. For instance, the
functional site can be a binding site (e.g. a receptor-binding site of a hormone or growth factor
or a ligand-binding site of a receptor), a catalytic site (e.g. of an enzyme), an antigen-binding
site (e.g. of an antibody), a regulatory site (e.g. of a polypeptide subject to regulation), or an
interaction site (e.g. for a regulatory protein or an inhibitor). The functional site can be
determined by methods known in the art and is conveniently identified by analysing a
three-dimensional or model structure of the polypeptide complexed to a relevant ligand.

The term "polypeptide" is intended to indicate any structural form (e.g. the primary,
secondary or tertiary form (i.e. protein form)) of an amino acid sequence comprising more than
5 amino acid residues, which may or may not be post-translationally modified (e.g. acetylated,
carboxylated, phosphorylated, lipidated, or acylated). The interchangeably used terms "native"
and "wild-type" are used about a polypeptide which has an amino acid sequence that is
identical to one found in nature. The native polypeptide is typically isolated from a naturally
occurring source, in particular a mammalian or microbial source, such as a human source, or is
produced recombinantly by use of a nucleotide sequence encoding the naturally occurring
amino acid sequence. The term "native" is intended to encompass allelic variants of the
polypeptide in question. A "variant" is a polypeptide, which has an amino acid sequence that
differs from that of a native polypeptide in one or more amino acid residues. The variant is
typically prepared by modification of a nucleotide sequence encoding the native polypeptide
(e.g. to result in substitution, deletion or truncation of one or more amino acid residues of the
polypeptide or by introduction (by addition or insertion) of one or more amino acid residues
into the polypeptide) so as to modify the amino acid sequence constituting said native
polypeptide. A "fragment" is a part of a parent native or variant polypeptide, typically differing
from such parent in one or more removed C-terminal or N-terminal amino acid residues or
removal of both types of such residues. Normally, the variant or fragment has retained at least
one of the functions of the corresponding parent polypeptide (e.g. a biological function such as
enzyme activity or receptor binding capability). Normally, the polypeptide Pp is a full length
protein or a variant or fragment thereof.

The term "antibody" includes single monoclonal antibodies (including agonist and
antagonist antibodies) and antibody compositions with polyepitopic specificity (also termed
polyclonal antibodies).

The term "monoclonal antibody" is used in its conventional meaning to indicate a
population of substantially homogeneous antibodies. The individual antibodies comprised in
the population have identical binding affinities and vary structurally only to a limited extent.
Monoclonal antibodies are highly specific, being directed against a single epitope.
Furthermore, in contrast to conventional (polyclonal) antibody preparations that typically
include different antibodies directed against different epitopes, each monoclonal antibody is
directed against a single epitope on the antigen. The antibody to be modified is preferably a
human or humanized monoclonal antibody.

"Antibody fragment" is defined as a portion of an intact antibody comprising the
antigen binding site or the entire or part of the variable region of the intact antibody, wherein
the portion is free of the constant heavy chain domains (i.e. CH2, CH3, and CH4, depending on
antibody isotype) of the Fc regions of the intact antibody. Examples of antibody fragments
include Fab, Fab', Fab'-SH, F(ab')2, and Fv fragments; diabodies; any antibody fragment that
is a polypeptide having a primary structure consisting of one uninterrupted sequence of
contiguous amino acid residues (which may also be termed a single chain antibody fragment or
a single chain polypeptide).

Polypeptide of the invention

In its first aspect the invention relates to a glycosylated polypeptide comprising the primary
structure,

NH2-X-Pp-COOH,
wherein

X is a peptide addition comprising or contributing to a glycosylation site, and Pp is a
polypeptide of interest.

In one embodiment the polypeptide consists essentially of or consists of a polypeptide
with the primary structure NH2-X-Pp-COOH.

The peptide addition according to this aspect is preferably one which has less than 90%
identity to a native full length protein. The identity is determined on the basis of an alignment
of the peptide addition to the entire amino acid sequence of the full length native protein, the
alignment being made to ensure the highest possible degree of identity between amino acid
residues. For instance, the program CLUSTALW version 1.74 using default parameters
(Thompson et al., 1994, CLUSTAL W: improving the sensitivity of progressive multiple
sequence alignment through sequence weighting, position-specific gap penalties and weight
matrix choice. Nucleic Acids Research, 22:4673-4680) can be used.
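
Once an alignment of the peptide addition against the full length protein is available (e.g. exported from an alignment program), the identity figure can be computed along the following lines. This is a sketch under stated assumptions: the text does not specify the denominator, so identity is taken here as identical aligned positions divided by the ungapped length of the peptide addition; the function name and the aligned strings are hypothetical, and the alignment itself is not performed by this code.

```python
def percent_identity(aligned_addition, aligned_protein):
    """Percent identity of a peptide addition to a protein, given a gapped
    pairwise alignment of equal length ('-' marks a gap).

    Assumption of this sketch: the denominator is the number of residues in
    the peptide addition, i.e. its non-gap positions.
    """
    if len(aligned_addition) != len(aligned_protein):
        raise ValueError("aligned sequences must have equal length")
    addition_length = sum(1 for a in aligned_addition if a != "-")
    if addition_length == 0:
        raise ValueError("peptide addition is empty")
    matches = sum(
        1
        for a, b in zip(aligned_addition, aligned_protein)
        if a != "-" and a == b
    )
    return 100.0 * matches / addition_length

# Hypothetical alignment: 3 of the 4 addition residues are identical.
identity = percent_identity("--ANIT--", "MKANVTLL")
print(identity)          # 75.0
print(identity < 90.0)   # True: below the 90% threshold stated above
```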

Usually, the peptide addition is fused to the N-terminal end of the polypeptide Pp as
reflected in the above shown structure so as to provide an N-terminal elongation of the
polypeptide Pp. However, it is also possible to insert the peptide addition within the amino acid
sequence of the polypeptide Pp. This is reflected in the polypeptide according to the second
aspect of the invention, wherein the polypeptide comprises the primary structure
NH2-Px-X-Py-COOH, wherein

Px is an N-terminal part of a polypeptide Pp of interest,
Py is a C-terminal part of said polypeptide Pp, and
X is a peptide addition comprising or contributing to a glycosylation site.

In one embodiment the polypeptide consists essentially of or consists of a polypeptide
with the primary structure NH2-Px-X-Py-COOH.

In order to minimize structural changes effected by the insertion of the peptide addition
within the sequence of the polypeptide Pp, it is desirable that it be inserted in a non-structural
part thereof. For instance, Px is a non-structural N-terminal part of a mature polypeptide Pp,
and Py is a structural C-terminal part of said mature polypeptide, or Px is a structural N-terminal
part of a mature polypeptide Pp, and Py is a non-structural C-terminal part of said mature
polypeptide. Preferably, when the glycosylation site to be introduced is an N-glycosylation site,
Px is a non-structural N-terminal part since, in general, the best N-glycosylation is obtained in
the N-terminal part of a polypeptide.

When the peptide addition comprises only a few amino acid residues, e.g. 1-5 such as 1-3
amino acid residues, and in particular 1 amino acid residue, the peptide addition can be inserted
into a loop structure of the polypeptide Pp and thereby elongate said loop. When the peptide
addition is constituted by one amino acid residue it will be understood that this is selected so as
to ensure that a functional glycosylation site is introduced.
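
How a single inserted residue can create a glycosylation site may be sketched as follows. The sketch assumes the commonly cited N-glycosylation sequon N-X-S/T, in which X is any residue other than proline; the definition of a glycosylation site used elsewhere in this description may be broader. The function names and the short sequences are hypothetical.

```python
from typing import List


def build_inserted_polypeptide(px: str, x: str, py: str) -> str:
    """Assemble the primary structure NH2-Px-X-Py-COOH as a one-letter string."""
    return px + x + py


def find_n_glycosylation_sequons(seq: str) -> List[int]:
    """Return 0-based positions of N in N-X-S/T sequons, X not being proline.

    Assumption: only the canonical N-X-S/T pattern is considered.
    """
    hits = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST":
            hits.append(i)
    return hits


# Hypothetical loop region "AQ|GSLK": inserting one asparagine before "GS"
# completes a sequon that is absent from the parent sequence.
px, py = "AQ", "GSLK"
parent = px + py
extended = build_inserted_polypeptide(px, "N", py)
print(find_n_glycosylation_sequons(parent))
print(find_n_glycosylation_sequons(extended))
```

The presence of a sequon is only a necessary condition; whether the site is in fact glycosylated depends on the host cell and on the accessibility of the site.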

Polypeptides of the invention are glycosylated polypeptides. Normally, the peptide
addition part of the polypeptide of the invention has attached at least one oligosaccharide
moiety. The polypeptide Pp part of the polypeptide may or may not have attached at least one
oligosaccharide moiety. Glycosylation can be achieved as described in the section entitled
"Glycosylation".

Preferably, the polypeptide of the invention has properties such as size, charge,
molecular weight and/or hydrodynamic volume that are sufficient to reduce or escape clearance
by any of the clearance mechanisms disclosed herein, in particular renal clearance. Such
properties are, e.g., determinable by the nature and number of oligosaccharide and second
non-peptide moieties attached thereto. In one embodiment, the polypeptide of the invention has a
molecular weight of at least 67 kDa, in particular at least 70 kDa as measured by SDS-PAGE
according to Laemmli, U.K., Nature Vol 227 (1970), p680-85. This is of particular relevance
when the polypeptide of interest is a therapeutically useful protein, the functional in vivo
half-life of which is to be prolonged. A molecular weight of at least 67 kDa is obtainable by
introduction of a sufficient number of glycosylation sites to obtain a glycosylated polypeptide
with such Mw, or by conjugating the glycosylated polypeptide to a sufficient number and type
of a second non-peptide moiety to obtain such Mw. For instance, for a glycosylated
polypeptide of interest having a molecular weight of at least 25 kDa linked to a peptide
addition of 2 kDa, the combined extended polypeptide having at least two PEG-attachment
groups, conjugation to two or more PEG molecules each having a molecular weight of 20 kDa
results in a total molecular weight of at least 67 kDa.
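
The arithmetic of the preceding example can be checked with a short Python sketch. It treats the molecular weights as simply additive and reads the weight of the polypeptide of interest as 25 kDa; the function name and the constant name are illustrative assumptions. An apparent weight measured by SDS-PAGE may differ from this nominal sum.

```python
CLEARANCE_THRESHOLD_KDA = 67.0  # lower limit stated in the embodiment above


def total_molecular_weight_kda(polypeptide_kda: float,
                               addition_kda: float,
                               peg_kda: float,
                               n_peg: int) -> float:
    """Nominal weight of the conjugate, assuming weights are simply additive."""
    return polypeptide_kda + addition_kda + n_peg * peg_kda


# Values from the example: 25 kDa polypeptide, 2 kDa peptide addition,
# two PEG molecules of 20 kDa each.
total = total_molecular_weight_kda(25.0, 2.0, 20.0, 2)
print(total, total >= CLEARANCE_THRESHOLD_KDA)
```

With a single 20 kDa PEG molecule the same sum is 47 kDa, which is why the example requires at least two PEG-attachment groups.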

Preferably, the polypeptide of the invention has at least one of the following properties
relative to the polypeptide Pp, the properties being measured under comparable conditions:

in vitro bioactivity which is at least 25%, such as at least 30% or at least 45% of that of the
polypeptide Pp as measured under comparable conditions, increased affinity for a mannose
receptor, a mannose-6-phosphate receptor or other carbohydrate receptors, increased serum
half-life, increased functional in vivo half-life, reduced renal clearance, reduced
immunogenicity, increased resistance to proteolytic cleavage, improved targeting to lysosomes,
macrophages and/or other subpopulations of human cells, improved stability in production,
improved shelf life, improved formulation, e.g. liquid formulation, improved purification,
improved solubility, and/or improved expression.

Improved properties are determined by conventional methods known in the art for determining
such properties. The improvement is of a magnitude that is within detection limits.


Improved affinity for or uptake by the mannose receptor is expected to result in increased
uptake of the polypeptide in phagocytic cells, preferably monocytes, macrophages (e.g. Kupffer
cells, glia/microglia, alveolar phagocytes, reticulum cells, or other peripheral macrophages) or
macrophage-like cells (for instance osteoclasts, dendritic cells, or astrocytes). This is of
particular relevance when the polypeptide of interest is one for which such uptake is required
for the polypeptide to exert its biological activity. Such polypeptide is e.g. an antigen intended
for use for vaccine purposes or a lysosomal enzyme.

Polypeptide of interest

The present invention can be applied broadly. Thus, the polypeptide of interest can have any
function and be of any origin. Accordingly, the polypeptide can be a protein, in particular a
mature protein or a precursor form thereof or a functional fragment thereof that essentially has
retained a biological activity of the mature protein. Furthermore, the polypeptide can be an
oligopeptide that contains in the range of 30 to 4500 amino acids, preferably in the range of 40
to 3000 amino acids.

The polypeptide can be a native polypeptide or a variant thereof. For instance, the
polypeptide is a variant that comprises at least one introduced and/or at least one removed
glycosylation site as compared to the corresponding native polypeptide. The variant has
retained at least one function of the corresponding native polypeptide, in particular a biological
activity thereof.

The polypeptide can be a therapeutic polypeptide useful in human or veterinary
therapy, i.e. a polypeptide that is physiologically active when introduced into the circulatory
system of or otherwise administered to a human or an animal; a diagnostic polypeptide useful
in diagnosis; or an industrial polypeptide useful for industrial purposes, such as in the
manufacture of goods wherein the polypeptide constitutes a functional ingredient or wherein
the polypeptide is used for processing or other modification of raw ingredients during the
manufacturing process.

The polypeptide can be of mammalian origin, e.g. of human, porcine, ovine, ursine,
murine, rabbit, donkey, or bat origin, of microbial origin, e.g. of fungal, yeast or bacterial
origin, or can be derived from other sources such as venom, leech, frog or mosquito origin.
Preferably, the industrial polypeptide of interest is of microbial origin and the therapeutic
polypeptide of human origin.

Specific examples of groups of polypeptides to be modified according to the invention
include: an antibody or antibody fragment, an immunoglobulin or immunoglobulin fragment, a
plasma protein, an erythrocyte or thrombocyte protein, a cytokine, a growth factor, a
profibrinolytic protein, a binding protein, a protease inhibitor, an antigen, an enzyme, a ligand,
a receptor, or a hormone. Of particular interest is a polypeptide that mediates its biological
effect by binding to a cellular receptor, when administered to a patient. The antibody can be a
polyclonal or monoclonal antibody, and can be of any origin including human, rabbit and
murine origin. Preferably, the antibody is a human or humanized monoclonal antibody.
Immunoglobulins of interest include IgG, IgE, IgM, IgA, and IgD and fragments thereof, e.g.
Fab fragments. Specific antibodies and fragments thereof are those reactive with any of the
proteins mentioned immediately below.

The non-antibody polypeptide of interest can be i) a plasma protein, e.g. a factor from
the coagulation system, such as Factor VII, Factor VIII, Factor IX, Factor X, Factor XIII,
thrombin, protein C, antithrombin III or heparin co-factor II, Tissue factor inhibitor (e.g. 1 or
2), endothelial cell surface protein C receptor, a factor from the fibrinolytic system such as
pro-urokinase, urokinase, tissue plasminogen activator, plasminogen activator inhibitor 1 (PAI-1)
or plasminogen activator inhibitor 2 (PAI-2), the von Willebrand factor, or an α-1-proteinase
inhibitor, ii) an erythrocyte or thrombocyte protein, e.g. hemoglobin, thrombospondin or platelet
factor 4, iii) a cytokine, e.g. an interleukin such as IL-1 (e.g. IL-1α or IL-1β), IL-2, IL-4, IL-5,
IL-6, IL-9, IL-10, IL-11, IL-12, IL-13, IL-15, IL-16, IL-17, IL-18, IL-19, IL-20, IL-21, IL-22,
IL-23, a cytokine-related polypeptide, such as IL-1Ra, an interferon such as interferon-α,
interferon-β or interferon-γ, a colony-stimulating factor such as GM-CSF or G-CSF, stem cell
factor (SCF), a binding protein, a member of the tumor necrosis factor family (e.g. TNF-α,
lymphotoxin-α, lymphotoxin-β, FasL, CD40L, CD30L, CD27L, Ox40L, 4-1BBL, RANKL,
TRAIL, TWEAK, LIGHT, TRANCE, APRIL, THANK or TALL-1), iv) a growth factor, e.g.
platelet-derived growth factor (PDGF), transforming growth factor α (TGF-α), transforming
growth factor β (TGF-β), epidermal growth factor (EGF), vascular endothelial growth factor
(VEGF), somatotropin (growth hormone), a somatomedin such as insulin-like growth factor I
(IGF-I) or insulin-like growth factor II (IGF-II), erythropoietin (EPO), thrombopoietin (TPO)
or angiopoietin, v) a profibrinolytic protein, e.g. staphylokinase or streptokinase, vi) a protease
inhibitor, e.g. aprotinin or CI-2A, vii) an enzyme, e.g. superoxide dismutase, catalase, uricase,
bilirubin oxidase, trypsin, papain, asparaginase, arginase, arginine deiminase, adenosine
deaminase, ribonuclease, alkaline phosphatase, β-glucuronidase, purine nucleoside
phosphorylase or batroxobin, viii) an opioid, e.g. endorphins, enkephalins or non-natural
opioids, ix) a hormone or neuropeptide, e.g. insulin, calcitonin, glucagons, adrenocorticotropic
hormone (ACTH), somatostatin, gastrins, cholecystokinins, parathyroid hormone (PTH),
luteinizing hormone (LH), follicle-stimulating hormone (FSH), gonadotropin-releasing
hormone, chorionic gonadotropin, corticotropin-releasing factor, vasopressin, oxytocin,
antidiuretic hormones, thyroid-stimulating hormone, thyrotropin-releasing hormone, relaxin,
glucagon-like peptide 1 (GLP-1), glucagon-like peptide 2 (GLP-2), prolactin, neuropeptide Y,
peptide YY, pancreatic polypeptide, leptin, orexin, CART (cocaine and amphetamine regulated
transcript), a CART-related peptide, melanocortins (melanocyte-stimulating hormones),
melanin-concentrating hormone, natriuretic peptides, adrenomedullin, endothelin, exendin,
secretin, amylin (IAPP; islet amyloid polypeptide precursor), vasoactive intestinal peptide
(VIP), pituitary adenylate cyclase activating polypeptide (PACAP), agouti and agouti-related
peptides or somatotropin-releasing hormones, or x) another type of protein or peptide such as
thymosin, bombesin, bombesin-like peptides, heparin-binding protein, soluble CD4,
pigmentary hormones, hypothalamic releasing factor, melanotonins, phospholipase activating
protein, a detoxifying enzyme such as acyloxyacyl hydrolase, or an antimicrobial peptide.

One group of polypeptides of particular interest in the present invention is selected
from the group of lysosomal enzymes (as defined in US 5,929,304) such as those responsible
for or otherwise involved in a lysosomal storage disease, i.e. enzymes that have a therapeutic
effect on patients with a lysosomal storage disease. Such enzymes include, e.g.,
glucocerebrosidase, α-L-iduronidase, acid α-glucosidase, α-galactosidase, acid
sphingomyelinase, galactocerebrosidase, arylsulphatase A, sialidase, and hexosaminidase.
Also, other proteins involved in lysosomal storage diseases such as Saposin A, B, C or D
(Nakano et al., J. Biochem. (Tokyo) 105, 152-154, 1989; Gavrieli-Rorman and Grabowski,
Genomics 5, 486-492, 1989) can be modified as described herein. Preferably, these
polypeptides are of human origin.

The present inventors have shown that providing such enzymes with additional N-linked
oligosaccharide moieties considerably improves properties thereof, such as stability,
targeting, expression, and in vivo activity. Accordingly, in one embodiment the
polypeptide of the invention is a glycosylated lysosomal enzyme comprising a peptide addition
comprising or contributing to a glycosylation site.

The industrial polypeptide is typically an enzyme, in particular a microbial enzyme, and
can be used in products or in the manufacture of products such as detergents, household
articles, personal care products, agrochemicals, textile, food products, in particular bakery
products, feed products, or in industrial processes such as hard surface cleaning. The industrial
polypeptide is normally not intended for internal administration to humans or animals. Specific
examples include hydrolases, such as proteases, lipases or cutinases, oxidoreductases, such as
laccase and peroxidase, transferases such as transglutaminases, isomerases, such as protein
disulphide isomerase and glucose isomerase, cell wall degrading enzymes such as cellulases,
xylanases, pectinases, mannanases, etc., amylolytic enzymes such as endoamylases, e.g.
alpha-amylases, or exo-amylases, e.g. beta-amylases or amyloglucosidases, etc. Further specific
examples are those listed in WO 00/26354, the contents of which are incorporated herein by
reference. Normally, an enzyme modified according to the present invention has one or more
improved properties selected from the group consisting of increased stability (in particular
against proteolytic degradation or thermal degradation) leading to, e.g., improved shelf life and
improved performance in use; improved production, e.g. in terms of improved expression (e.g.
as a consequence of improved secretion and/or increased stability of the expressed enzyme)
and improved purification, decreased allergenicity, increased activity in the relevant industrial
process in which it is used, and improved properties with respect to immobilization.

When the polypeptide Pp is an industrial enzyme the N-terminal peptide addition may
comprise or contribute to a glycosylation site. However, it is also within the scope of the
present invention to provide a polypeptide comprising an industrial enzyme and a C-terminal
or N-terminal peptide addition comprising an attachment group for a second non-peptide
moiety being a polymer, e.g. PEG. The peptide addition may or may not comprise a
glycosylation site. The peptide addition is preferably as described herein. For instance, such
attachment group can be provided by a lysine or cysteine residue.

In one embodiment the polypeptide of the invention comprises a personal care enzyme
(i.e. an enzyme useful for personal care applications), which polypeptide is incapable of
passing the mucous membrane of a mammal, in particular a human, exposed to the polypeptide.
Thereby, allergenicity can be reduced or avoided. Furthermore, stability of such enzyme can be
increased. The polypeptide according to this embodiment comprises an N-terminal or
C-terminal peptide addition comprising or contributing to a glycosylation site and/or an
attachment group for a second non-peptide moiety, e.g. a polymer such as PEG.

In another embodiment the polypeptide comprises a lipase as disclosed in WO
97/04079, in particular a Humicola lanuginosa lipase, wherein the N- or C-terminal peptide
addition comprises a glycosylation site and/or at least one attachment group for a second
non-peptide moiety, e.g. a polymer such as PEG. Thereby, the N- or C-terminal peptide addition is
shielded from degradation and/or increased expression, including secretion, of the enzyme is
likely to be obtained. In connection with this embodiment the N-terminal peptide addition can
comprise any of the peptide additions disclosed in WO 97/04079.

In yet another embodiment the polypeptide Pp is an amyloglucosidase and the N- or
C-terminal peptide addition comprises or contributes to a glycosylation site and/or an attachment
group for a second non-peptide moiety, e.g. a polymer such as PEG. When the peptide addition
is N-terminal the modification of such enzyme is contemplated to result in reduced or no
degradation of the N-terminus of said enzyme (an otherwise well known problem associated
with the recombinant production of amyloglucosidase). In other words the N-terminus of the
enzyme is protected by the non-peptide moiety attached to the N-terminal peptide addition of
the amyloglucosidase.

In yet another embodiment the polypeptide Pp is an antigen, in particular an antigen
intended for use in eliciting an immune response (for vaccine purposes). It is contemplated to
be advantageous to add N-terminal glycosylation site(s) to antigens in accordance with the
invention in that the risk of changing antigenicity is thereby reduced. Antigens are recognized
by a wide range of target cells, including antigen presenting cells (APC), and taken up by those
cells for efficient intracellular processing and presentation to other cells of the immune system,
such as, e.g., T cells, to induce or elicit desired immune responses. Antigens (and fragments
thereof, e.g., antigen peptides) can be modified by a peptide addition and non-peptide moieties
according to the invention. Such modifications facilitate and/or optimize uptake and/or
targeting to processing compartment of the antigen by such target cells. For example,
N-terminally extended antigen polypeptides of the invention are taken up by the target cells more
efficiently and/or at an enhanced or improved rate (when the non-peptide moiety is one
involved in such uptake). Such efficient, improved, or enhanced uptake of modified antigens
by the target cells increases the kinetics and potency of the immune response to the
immunizing antigen. These modifications to antigens also improve the affinity of the antigens
for particular cellular receptors on target cells, including, e.g., mannose receptors and other
carbohydrate receptors (in particular when the non-peptide moiety is an oligosaccharide
moiety).

Antigen polypeptides of the invention include, but are not limited to, those for which an
improved, enhanced or altered uptake of antigens in the following types of target cells is
desired: antigen-presenting and antigen-processing cells, such as monocytes, B cells,
antigen-presenting macrophages, marginal zone macrophages, follicular dendritic cells, dendritic cells,
Langerhans cells, keratinocytes, M-cells (e.g., M-cells of the gut), myocytes for intramuscular
immunization or epithelial cells for mucosal immunization, Kupffer cells in the liver, and the
like. A number of other cells, including capillary endothelium and some endocrine cells, can
present antigen in some circumstances; the cells develop MHC class II molecules that confer
antigen-presenting function. Furthermore, MHC class I molecules are expressed on the surface
of most nucleated cells, including, for example, muscle cells, and therefore these cells can also
present antigens to CD8+ T cells. Activated T cells, which release IFN-gamma, actively induce
expression of MHC molecules on some tissue cells. Such cells are also of use with the novel
polypeptides of the invention. Preferably, such cells are of mammalian origin, in particular
human (for use in immunization of a human) or animal (for veterinary purposes).

A wide range of antigens can be modified according to the invention. Examples are as follows:

Cancer antigens 

Examples of cancer antigens that can be modified according to the invention include,
but are not limited to: bullous pemphigoid antigen 2, prostate mucin antigen (PMA) (Beckett
and Wright (1995) Int. J. Cancer 62: 703-710), tumor associated Thomsen-Friedenreich
antigen (Dahlenborg et al. (1997) Int. J. Cancer 70: 63-71), prostate-specific antigen (PSA)
(Dannull and Belldegrun (1997) Br. J. Urol. 1: 97-103), EpCam/KSA antigen, luminal
epithelial antigen (LEA.135) of breast carcinoma and bladder transitional cell carcinoma
(TCC) (Jones et al. (1997) Anticancer Res. 17: 685-687), cancer-associated serum antigen
(CASA) and cancer antigen 125 (CA 125) (Kierkegaard et al. (1995) Gynecol. Oncol. 59:
251-254), the epithelial glycoprotein 40 (EGP40) (Kievit et al. (1997) Int. J. Cancer 71: 237-245),
squamous cell carcinoma antigen (SCC) (Lozza et al. (1997) Anticancer Res. 17: 525-529),
cathepsin E (Mota et al. (1997) Am. J. Pathol. 150: 1223-1229), tyrosinase in melanoma
(Fishman et al. (1997) Cancer 79: 1461-1464), cell nuclear antigen (PCNA) of cerebral
cavernomas (Notelet et al. (1997) Surg. Neurol. 47: 364-370), DF3/MUC1 breast cancer
antigen (Apostolopoulos et al. (1996) Immunol. Cell Biol. 74: 457-464; Pandey et al. (1995)
Cancer Res. 55: 4000-4003), carcinoembryonic antigen (Paone et al. (1996) J. Cancer Res.
Clin. Oncol. 122: 499-503; Schlom et al. (1996) Breast Cancer Res. Treat. 38: 27-39),
tumor-associated antigen CA 19-9 (Tolliver and O'Brien (1997) South Med. J. 90: 89-90; Tsuruta et
al. (1997) Urol. Int. 58: 20-24), human melanoma antigens MART-1/Melan-A27-35 and gp100
(Kawakami and Rosenberg (1997) Int. Rev. Immunol. 14: 173-192; Zajac et al. (1997) Int. J.
Cancer 71: 491-496), the T and Tn pancarcinoma (CA) glycopeptide epitopes (Springer (1995)
Crit. Rev. Oncog. 6: 57-85), a 35 kD tumor-associated autoantigen in papillary thyroid
carcinoma (Lucas et al. (1996) Anticancer Res. 16: 2493-2496), KH-1 adenocarcinoma antigen
(Deshpande and Danishefsky (1997) Nature 387: 164-166), the A60 mycobacterial antigen
(Maes et al. (1996) J. Cancer Res. Clin. Oncol. 122: 296-300), heat shock proteins (HSPs)
(Blachere and Srivastava (1995) Semin. Cancer Biol. 6: 349-355), and MAGE, tyrosinase,
melan-A and gp75 and mutant oncogene products (e.g., p53, ras, and HER-2/neu) (Bueler and
Mulligan (1996) Mol. Med. 2: 545-555; Lewis and Houghton (1995) Semin. Cancer Biol. 6:
321-327; Theobald et al. (1995) Proc. Nat'l Acad. Sci. USA 92: 11993-11997); TAG-72, a
mucin antigen expressed in most human adenocarcinomas (McGuinness et al. (1999) Hum. Gene
Ther. 10: 165-73).

Bacterial antigens

Bacterial antigens that can be modified according to the invention include, but are not
limited to, Helicobacter pylori antigens CagA and VacA (Blaser (1996) Aliment. Pharmacol.
Ther. 1: 73-7; Blaser and Crabtree (1996) Am. J. Clin. Pathol. 106: 565-7; Censini et al. (1996)
Proc. Nat'l Acad. Sci. USA 93: 14648-14653). Other suitable H. pylori antigens include, for
example, four immunoreactive proteins of 45-65 kDa as reported by Chatha et al. (1997)
Indian J. Med. Res. 105: 170-175 and the H. pylori GroES homologue (HspA) (Kansau et al.
(1996) Mol. Microbiol. 22: 1013-1023). Other suitable bacterial antigens include, but are not
limited to, the 43-kDa and the fimbrilin (41 kDa) proteins of P. gingivalis (Boutsi et al. (1996)
Oral Microbiol. Immunol. 11: 236-241); pneumococcal surface protein A (Briles et al. (1996)
Ann. NY Acad. Sci. 797: 118-126); Chlamydia psittaci antigens, 80-90 kDa protein and 110
kDa protein (Buendia et al. (1997) FEMS Microbiol. Lett. 150: 113-9); the chlamydial
exoglycolipid antigen (GLXA) (Whittum-Hudson et al. (1996) Nature Med. 2: 1116-1121);
Chlamydia pneumoniae species-specific antigens in the molecular weight ranges 92-98, 51-55,
43-46 and 31.5-33 kDa and genus-specific antigens in the ranges 12, 26 and 65-70 kDa (Halme
et al. (1997) Scand. J. Immunol. 45: 378-84); Neisseria gonorrhoeae (GC) or Escherichia coli
phase-variable opacity (Opa) proteins (Chen and Gotschlich (1996) Proc. Nat'l. Acad. Sci.
USA 93: 14851-14856), any of the twelve immunodominant proteins of Schistosoma mansoni
(ranging in molecular weight from 14 to 208 kDa) as described by Cutts and Wilson (1997)
Parasitology 114: 245-55; the 17-kDa protein antigen of Brucella abortus (De Mot et al.
(1996) Curr. Microbiol. 33: 26-30); a gene homolog of the 17-kDa protein antigen of the
Gram-negative pathogen Brucella abortus identified in the nocardioform actinomycete
Rhodococcus sp. NI86/21 (De Mot et al. (1996) Curr. Microbiol. 33: 26-30); the
staphylococcal enterotoxins (SEs) (Wood et al. (1997) FEMS Immunol. Med. Microbiol. 17:
1-10), a 42-kDa M. hyopneumoniae NrdF ribonucleotide reductase R2 protein or 15-kDa subunit
protein of M. hyopneumoniae (Fagan et al. (1997) Infect. Immun. 65: 2502-2507), the
meningococcal antigen PorA protein (Feavers et al. (1997) Clin. Diagn. Lab. Immunol. 3:
444-50); pneumococcal surface protein A (PspA) (McDaniel et al. (1997) Gene Ther. 4: 375-377);
F. tularensis outer membrane protein FopA et al. (1996) FEMS Immunol. Med.
Microbiol. 13: 245-247); the major outer membrane protein within strains of the genus
Actinobacillus (Hartmann et al. (1996) Zentralbl. Bakteriol. 284: 255-262); p60 or listeriolysin
(Hly) antigen of Listeria monocytogenes (Hess et al. (1996) Proc. Nat'l Acad. Sci. USA 93:
1458-1463); flagellar (G) antigens observed on Salmonella enteritidis and S. pullorum (Holt
and Chaubal (1997) J. Clin. Microbiol. 35: 1016-1020); Bacillus anthracis protective antigen
(PA) (Ivins et al. (1995) Vaccine 13: 1779-1784); Echinococcus granulosus antigen 5 (Jones et
al. (1996) Parasitology 113: 213-222); the rol genes of Shigella dysenteriae 1 and Escherichia
coli K-12 (Klee et al. (1997) J. Bacteriol. 179: 2421-2425); cell surface proteins Rib and alpha
of group B streptococcus (Larsson et al. (1996) Infect. Immun. 64: 3518-3523); the 37 kDa
secreted polypeptide encoded on the 70 kb virulence plasmid of pathogenic Yersinia spp.
(Leary et al. (1995) Contrib. Microbiol. Immunol. 13: 216-217 and Roggenkamp et al. (1997)
Infect. Immun. 65: 446-51); the OspA (outer surface protein A) of the Lyme disease spirochete
Borrelia burgdorferi (Li et al. (1997) Proc. Nat'l. Acad. Sci. USA 94: 3584-3589, Padilla et al.
(1996) J. Infect. Dis. 174: 739-746, and Wallich et al. (1996) Infection 24: 396-397); the
Brucella melitensis group 3 antigen gene encoding Omp28 (Lindler et al. (1996) Infect.
Immun. 64: 2490-2499); the PAc antigen of Streptococcus mutans (Murakami et al. (1997)
Infect. Immun. 65: 794-797); pneumolysin, pneumococcal neuraminidases, autolysin,
hyaluronidase, and the 37 kDa pneumococcal surface adhesin A (Paton et al. (1997) Microb.
Drug Resist. 3: 1-10); 29-32, 41-45, 63-71 x 10(3) MW antigens of Salmonella typhi (Perez et
al. (1996) Immunology 89: 262-267); K-antigen as a marker of Klebsiella pneumoniae
(Priamukhina and Morozova (1996) Klin. Lab. Diagn. 47-9); nocardial antigens of molecular
mass approximately 60, 40, 20 and 15-10 kDa (Prokesova et al. (1996) Int. J.
Immunopharmacol. 18: 661-668); Staphylococcus aureus antigen ORF-2 (Rieneck et al.
(1997) Biochim. Biophys. Acta 1350: 128-132); GlpQ antigen of Borrelia hermsii (Schwan et al.
(1996) J. Clin. Microbiol. 34: 2483-2492); cholera protective antigen (CPA) (Sciortino (1996)
J. Diarrhoeal Dis. Res. 14: 16-26); a 190-kDa protein antigen of Streptococcus mutans
(Senpuku et al. (1996) Oral Microbiol. Immunol. 11: 121-128); Anthrax toxin protective
antigen (PA) (Sharma et al. (1996) Protein Expr. Purif. 7: 33-38); Clostridium perfringens
antigens and toxoid (Strom et al. (1995) Br. J. Rheumatol. 34: 1095-1096); the SEF14 fimbrial
antigen of Salmonella enteritidis (Thorns et al. (1996) Microb. Pathog. 20: 235-246); the
Yersinia pestis capsular antigen (F1 antigen) (Titball et al. (1997) Infect. Immun. 65:
1926-1930); a 35-kilodalton protein of Mycobacterium leprae (Triccas et al. (1996) Infect. Immun.
64: 5171-5177); the major outer membrane protein, CD, extracted from Moraxella
(Branhamella) catarrhalis (Yang et al. (1997) FEMS Immunol. Med. Microbiol. 17: 187-199);
pH6 antigen (PsaA protein) of Yersinia pestis (Zav'yalov et al. (1996) FEMS Immunol. Med.
Microbiol. 14: 53-57); a major surface glycoprotein, gp63, of Leishmania major (Xu and Liew
(1994) Vaccine 12: 1534-1536; Xu and Liew (1995) Immunology 84: 173-176); mycobacterial
heat shock protein 65, mycobacterial antigen (Mycobacterium leprae hsp65) (Lowrie et al.
(1994) Vaccine 12: 1537-1540; Ragno et al. (1997) Arthritis Rheum. 40: 277-283; Silva (1995)
Braz. J. Med. Biol. Res. 28: 843-851); Mycobacterium tuberculosis antigen 85 (Ag85) (Huygen
et al. (1996) Nat. Med. 2: 893-898); the 45/47 kDa antigen complex (APA) of Mycobacterium
tuberculosis, M. bovis and BCG (Horn et al. (1996) J. Immunol. Methods 197: 151-159); the
mycobacterial antigen, 65-kDa heat shock protein, hsp65 (Tascon et al. (1996) Nat. Med. 2:
888-892); the mycobacterial antigens MPB64, MPB70, MPB57 and alpha antigen (Yamada et
al. (1995) Kekkaku 70: 639-644); the M. tuberculosis 38 kDa protein (Vordermeier et al.
(1995) Vaccine 13: 1576-1582); the MPT63, MPT64 and MPT-59 antigens from
Mycobacterium tuberculosis (Manca et al. (1997) Infect. Immun. 65: 16-23; Oettinger et al.
(1997) Scand. J. Immunol. 45: 499-503; Wilcke et al. (1996) Tuber. Lung Dis. 77: 250-256);
the 35-kilodalton protein of Mycobacterium leprae (Triccas et al. (1996) Infect. Immun. 64:
5171-5177); the ESAT-6 antigen of virulent mycobacteria (Brandt et al. (1996) J. Immunol.
157: 3527-3533; Pollock and Andersen (1997) J. Infect. Dis. 175: 1251-1254); Mycobacterium
tuberculosis 16-kDa antigen (Hsp16.3) (Chang et al. (1996) J. Biol. Chem. 271: 7218-7223);
and the 18-kilodalton protein of Mycobacterium leprae (Baumgart et al. (1996) Infect. Immun.
64: 2274-2281); protective antigen (PA) of B. anthracis; V antigen from Yersinia pestis, Y.
enterocolitica, and Y. pseudotuberculosis; antigens against bacterium Vibrio cholerae, cholera
toxin B subunit, and heat-labile enterotoxins (LT) from enterotoxigenic E. coli strains.

Viral pathogens

Polypeptides or proteins corresponding to or associated with various viral pathogens,
including, but not limited to, e.g., hanta virus (e.g., hanta virus glycoproteins), flaviviruses,
such as, e.g., Dengue viruses (e.g., envelope proteins), Japanese, St. Louis and Murray Valley
encephalitis viruses, tick-borne encephalitis viruses can be modified according to the invention.
Viral antigens that can be modified according to the invention include, but are not
limited to, influenza A virus N2 neuraminidase (Kilbourne et al. (1995) Vaccine 13:
1799-1803); Dengue virus envelope (E) and premembrane (prM) antigens (Feighny et al. (1994) Am.
J. Trop. Med. Hyg. 50: 322-328; Putnak et al. (1996) Am. J. Trop. Med. Hyg. 55: 504-10); HIV
antigens Gag, Pol, Vif and Nef (Vogt et al. (1995) Vaccine 13: 202-208); HIV antigens gp120
and gp160 (Achour et al. (1995) Cell. Mol. Biol. 41: 395-400; Hone et al. (1994) Dev. Biol.
Stand. 82: 159-162); gp41 epitope of human immunodeficiency virus (Eckhart et al. (1996) J.
Gen. Virol. 77: 2001-2008); rotavirus antigen VP4 (Mattion et al. (1995) J. Virol. 69:
5132-5137); the rotavirus protein VP7 or VP7sc (Emslie et al. (1995) J. Virol. 69: 1747-1754; Xu et
al. (1995) J. Gen. Virol. 76: 1971-1980); herpes simplex virus (HSV) glycoproteins gB, gC,
gD, gE, gG, gH, and gI (Fleck et al. (1994) Med. Microbiol. Immunol. (Berl) 183: 87-94
[Mattion, 1995]; Ghiasi et al. (1995) Invest. Ophthalmol. Vis. Sci. 36: 1352-1360; McLean et
al. (1994) J. Infect. Dis. 170: 1100-1109); immediate-early protein ICP47 of herpes simplex
virus-type 1 (HSV-1) (Banks et al. (1994) Virology 200: 236-245); immediate-early (IE)
proteins ICP27, ICP0, and ICP4 of herpes simplex virus (Manickan et al. (1995) J. Virol. 69:
4711-4716); influenza virus nucleoprotein and hemagglutinin (Deck et al. (1997) Vaccine 15:
71-78; Fu et al. (1997) J. Virol. 71: 2715-2721); B19 parvovirus capsid proteins VP1 (Kawase
et al. (1995) Virology 211: 359-366) or VP2 (Brown et al. (1994) Virology 198: 477-488);
Hepatitis B virus core and e antigen and capsid protein (Schodel et al. (1996) Intervirology 39:
104-106); hepatitis B surface antigen (Shiau and Murray (1997) J. Med. Virol. 51: 159-166);
hepatitis B surface antigen fused to the core antigen of the virus (Id.); Hepatitis B virus
core-preS2 particles (Nemeckova et al. (1996) Acta Virol. 40: 273-279); HBV preS2-S protein
(Kutinova et al. (1996) Vaccine 14: 1045-1052); VZV glycoprotein I (Kutinova et al. (1996)
Vaccine 14: 1045-1052); rabies virus glycoproteins (Xiang et al. (1994) Virology 199:
132-140; Xuan et al. (1995) Virus Res. 36: 151-161) or ribonucleocapsid (Hooper et al. (1994)
Proc. Nat'l Acad. Sci. USA 91: 10908-10912); human cytomegalovirus (HCMV) glycoprotein
B (UL55) (Britt et al. (1995) J. Infect. Dis. 171: 18-25); the hepatitis C virus (HCV)
nucleocapsid protein in a secreted or a nonsecreted form, or as a fusion protein with the middle
(pre-S2 and S) or major (S) surface antigens of hepatitis B virus (HBV) (Inchauspe et al.
(1997) DNA Cell Biol. 16: 185-195; Major et al. (1995) J. Virol. 69: 5798-5805); the hepatitis
C virus antigens: the core protein (pC); E1 (pE1) and E2 (pE2) alone or as fusion proteins
(Saito et al. (1997) Gastroenterology 112: 1321-1330); the gene encoding respiratory syncytial
virus fusion protein (PFP-2) (Falsey and Walsh (1996) Vaccine 14: 1214-1218; Piedra et al.
(1996) Pediatr. Infect. Dis. J. 15: 23-31); the VP6 and VP7 genes of rotaviruses (Choi et al.
(1997) Virology 232: 129-138; Jin et al. (1996) Arch. Virol. 141: 2057-2076); the E1, E2, E3,
E4, E5, E6 and E7 proteins of human papillomavirus (Brown et al. (1994) Virology 201:
46-54; Dillner et al. (1995) Cancer Detect. Prev. 19: 381-393; Krul et al. (1996) Cancer Immunol.
Immunother. 43: 44-48; Nakagawa et al. (1997) J. Infect. Dis. 175: 927-931); a human
T-lymphotropic virus type I gag protein (Porter et al. (1995) J. Med. Virol. 45: 469-474);
Epstein-Barr virus (EBV) gp340 (Mackett et al. (1996) J. Med. Virol. 50: 263-271); the Epstein-Barr
virus (EBV) latent membrane protein LMP2 (Lee et al. (1996) Eur. J. Immunol. 26:
1875-1883); Epstein-Barr virus nuclear antigens 1 and 2 (Chen and Cooper (1996) J. Virol. 70:
4849-4853; Khanna etal. (1995) Virology 214: 633-637); the measles virus nucleoprotein (N) 
(Fooks et al. (1995) Virology 210: 456-465); and cytomegalovirus glycoprotein gB (Marshall 
et al. (1994) J. Med. Virol 43: 77-83) or glycoprotein gH (Rasmussen et dl. (1994) /. Irtfect. 

30 Dis. 170: 673-677). 

Parasites 

Antigens from parasites can also be modified according to the invention. These
include, but are not limited to, the schistosome gut-associated antigens CAA (circulating
anodic antigen) and CCA (circulating cathodic antigen) in Schistosoma mansoni, S.
haematobium or S. japonicum (Deelder et al. (1996) Parasitology 112: 21-35); a multiple
antigen peptide (MAP) composed of two distinct protective antigens derived from the parasite
Schistosoma mansoni (Ferru et al. (1997) Parasite Immunol. 19: 1-11); Leishmania parasite
surface molecules (Lezama-Davila (1997) Arch. Med. Res. 28: 47-53); third-stage larval (L3)
antigens of L. loa (Akue et al. (1997) J. Infect. Dis. 175: 158-63); the genes, Tams1-1 and
Tams1-2, encoding the 30- and 32-kDa major merozoite surface antigens of Theileria annulata
(Ta) (d'Oliveira et al. (1996) Gene 172: 33-39); Plasmodium falciparum merozoite surface
antigen 1 or 2 (al-Yaman et al. (1995) Trans. R. Soc. Trop. Med. Hyg. 89: 555-559; Beck et al.
(1997) J. Infect. Dis. 175: 921-926; Rzepczyk et al. (1997) Infect. Immun. 65: 1098-1100);
circumsporozoite (CS) protein-based B-epitopes from Plasmodium berghei, (PPPPNPND)2
and Plasmodium yoelii, (QGPGAP)3QG, along with a P. berghei T-helper epitope
KQIRDSIEEEWS (Reed et al. (1997) Vaccine 15: 482-488); NYVAC-Pf7 encoded
Plasmodium falciparum antigens derived from the sporozoite (circumsporozoite protein and
sporozoite surface protein 2), liver (liver stage antigen 1), blood (merozoite surface protein 1,
serine repeat antigen, and apical membrane antigen 1), and sexual (25-kDa sexual-stage
antigen) stages of the parasite life cycle were inserted into a single NYVAC genome to
generate NYVAC-Pf7 (Tine et al. (1996) Infect. Immun. 64: 3833-3844); Plasmodium
falciparum antigen Pfs230 (Williamson et al. (1996) Mol. Biochem. Parasitol. 78: 161-169);
Plasmodium falciparum apical membrane antigen (AMA-1) (Lal et al. (1996) Infect. Immun.
64: 1054-1059); Plasmodium falciparum proteins Pfs28 and Pfs25 (Duffy and Kaslow (1997)
Infect. Immun. 65: 1109-1113); Plasmodium falciparum merozoite surface protein, MSP1 (Hui
et al. (1996) Infect. Immun. 64: 1502-1509); the malaria antigen Pf332 (Ahlborg et al. (1996)
Immunology 88: 630-635); Plasmodium falciparum erythrocyte membrane protein 1 (Baruch et
al. (1995) Proc. Nat'l. Acad. Sci. USA 93: 3497-3502; Baruch et al. (1995) Cell 82: 77-87);
Plasmodium falciparum merozoite surface antigen, PfMSP-1 (Egan et al. (1996) J. Infect. Dis.
173: 765-769); Plasmodium falciparum antigens SERA, EBA-175, RAP1 and RAP2 (Riley
(1997) J. Pharm. Pharmacol. 49: 21-27); Schistosoma japonicum paramyosin (Sj97) or
fragments thereof (Yang et al. (1995) Biochem. Biophys. Res. Commun. 212: 1029-1039); and
Hsp70 in parasites (Maresca and Kobayashi (1994) Experientia 50: 1067-1074).

Allergen antigens 

Allergen antigens that can be modified according to the invention include, but are not
limited to, those of animals, including the mite (e.g., Dermatophagoides pteronyssinus,
Dermatophagoides farinae, Blomia tropicalis), such as the allergens der p1 (Scobie et al.
(1994) Biochem. Soc. Trans. 22: 448S; Yssel et al. (1992) J. Immunol. 148: 738-745), der p2
(Chua et al. (1996) Clin. Exp. Allergy 26: 829-837), der p3 (Smith and Thomas (1996) Clin.
Exp. Allergy 26: 571-579), der p5, der p V (Lin et al. (1994) J. Allergy Clin. Immunol. 94:
989-996), der p6 (Bennett and Thomas (1996) Clin. Exp. Allergy 26: 1150-1154), der p7 (Shen
et al. (1995) Clin. Exp. Allergy 25: 416-422), der f2 (Yuuki et al. (1997) Int. Arch. Allergy
Immunol. 112: 44-48), der f3 (Nishiyama et al. (1995) FEBS Lett. 377: 62-66), der f7 (Shen et
al. (1995) Clin. Exp. Allergy 25: 1000-1006); Mag 3 (Fujikawa et al. (1996) Mol. Immunol. 33:
311-319). Also of interest as antigens are the house dust mite allergens Tyr p2 (Eriksson et al.
(1998) Eur. J. Biochem. 251: 443-447), Lep d1 (Schmidt et al. (1995) FEBS Lett. 370: 11-14),
and glutathione S-transferase (O'Neill et al. (1995) Immunol. Lett. 48: 103-107); the 25,589 Da,
219 amino acid polypeptide with homology with glutathione S-transferases (O'Neill et al.
(1994) Biochim. Biophys. Acta 1219: 521-528); Blo t 5 (Arruda et al. (1995) Int. Arch. Allergy
Immunol. 107: 456-457); bee venom phospholipase A2 (Carballido et al. (1994) J. Allergy
Clin. Immunol. 93: 758-767; Jutel et al. (1995) J. Immunol. 154: 4187-4194); bovine
dermal/dander antigens BDA 11 (Rautiainen et al. (1995) J. Invest. Dermatol. 105: 660-663)
and BDA20 (Mantyjarvi et al. (1996) J. Allergy Clin. Immunol. 97: 1297-1303); the major
horse allergen Equ c1 (Gregoire et al. (1996) J. Biol. Chem. 271: 32951-32959); jumper ant M.
pilosula allergen Myr p I and its homologous allergenic polypeptides Myr p2 (Donovan et al.
(1996) Biochem. Mol. Biol. Int. 39: 877-885); 1-13, 14, 16 kD allergens of the mite Blomia
tropicalis (Caraballo et al. (1996) J. Allergy Clin. Immunol. 98: 573-579); the cockroach
allergens Bla g Bd90K (Helm et al. (1996) J. Allergy Clin. Immunol. 98: 172-80) and Bla g 2
(Arruda et al. (1995) J. Biol. Chem. 270: 19563-19568); the cockroach Cr-PI allergens (Wu et
al. (1996) J. Biol. Chem. 271: 17937-17943); fire ant venom allergen, Sol i 2 (Schmidt et al.
(1996) J. Allergy Clin. Immunol. 98: 82-88); the insect Chironomus thummi major allergen Chi
t 1-9 (Kipp et al. (1996) Int. Arch. Allergy Immunol. 110: 348-353); dog allergen Can f 1 or cat
allergen Fel d 1 (Ingram et al. (1995) J. Allergy Clin. Immunol. 96: 449-456); albumin,
derived, for example, from horse, dog or cat (Goubran Botros et al. (1996) Immunology 88:
340-347); deer allergens with the molecular mass of 22 kD, 25 kD or 60 kD (Spitzauer et al.
(1997) Clin. Exp. Allergy 27: 196-200); and the 20 kD major allergen of cow (Ylonen et al.
(1994) J. Allergy Clin. Immunol. 93: 851-858).

Pollen and grass allergens can also be modified according to the invention. Such
allergens include, for example, Hor v9 (Astwood and Hill (1996) Gene 182: 53-62), Lig v1
(Batanero et al. (1996) Clin. Exp. Allergy 26: 1401-1410); Lol p 1 (Muller et al. (1996) Int.
Arch. Allergy Immunol. 109: 352-355), Lol p II (Tamborini et al. (1995) Mol. Immunol. 32:
505-513), Lol pVA, Lol pVB (Ong et al. (1995) Mol. Immunol. 32: 295-302), Lol p 9 (Blaher
et al. (1996) J. Allergy Clin. Immunol. 98: 124-132); Par j I (Costa et al. (1994) FEBS
Lett. 341: 182-186; Sallusto et al. (1996) J. Allergy Clin. Immunol. 97: 627-637), Par j 2.0101
(Duro et al. (1996) FEBS Lett. 399: 295-298); Bet v1 (Faber et al. (1996) J. Biol. Chem. 271:
19243-19250), Bet v2 (Rihs et al. (1994) Int. Arch. Allergy Immunol. 105: 190-194); Dac g3
(Guerin-Marchand et al. (1996) Mol. Immunol. 33: 797-806); Phl p 1 (Petersen et al. (1995) J.
Allergy Clin. Immunol. 95: 987-994), Phl p 5 (Muller et al. (1996) Int. Arch. Allergy Immunol.
109: 352-355), Phl p 6 (Petersen et al. (1995) Int. Arch. Allergy Immunol. 108: 55-59); Cry j I
(Sone et al. (1994) Biochem. Biophys. Res. Commun. 199: 619-625), Cry j II (Namba et al.
(1994) FEBS Lett. 353: 124-128); Cor a 1 (Schenk et al. (1994) Eur. J. Biochem. 224: 717-722);
cyn d1 (Smith et al. (1996) J. Allergy Clin. Immunol. 98: 331-343), cyn d7 (Suphioglu et
al. (1997) FEBS Lett. 402: 167-172); Pha a 1 and isoforms of Pha a 5 (Suphioglu and Singh
(1995) Clin. Exp. Allergy 25: 853-865); Cha o 1 (Suzuki et al. (1996) Mol. Immunol. 33:
451-460); profilin derived, e.g., from timothy grass or birch pollen (Valenta et al. (1994)
Biochem. Biophys. Res. Commun. 199: 106-118); P0149 (Wu et al. (1996) Plant Mol. Biol. 32:
1037-1042); Ory s1 (Xu et al. (1995) Gene 164: 255-259); and Amb a V and Amb t 5 (Kim et al.
(1996) Mol. Immunol. 33: 873-880; Zhu et al. (1995) J. Immunol. 155: 5064-5073).

Food allergens that can be modified according to the invention include, for example,
profilin (Rihs et al. (1994) Int. Arch. Allergy Immunol. 105: 190-194); rice allergenic cDNAs
belonging to the alpha-amylase/trypsin inhibitor gene family (Alvarez et al. (1995) Biochim.
Biophys. Acta 1251: 201-204); the main olive allergen, Ole e I (Lombardero et al. (1994) Clin.
Exp. Allergy 24: 765-770); Sin a 1, the major allergen from mustard (Gonzalez De La Pena et
al. (1996) Eur. J. Biochem. 237: 827-832); parvalbumin, the major allergen of salmon
(Lindstrom et al. (1996) Scand. J. Immunol. 44: 335-344); apple allergens, such as the major
allergen Mal d 1 (Vanek-Krebitz et al. (1995) Biochem. Biophys. Res. Commun. 214: 538-551);
and peanut allergens, such as Ara h I (Burks et al. (1995) J. Clin. Invest. 96: 1715-1721).

Fungal allergens that can be modified according to the invention include, but are not
limited to, the allergen, Cla h III, of Cladosporium herbarum (Zhang et al. (1995) J. Immunol.
154: 710-717); the allergen Psi c 2, a fungal cyclophilin, from the basidiomycete Psilocybe
cubensis (Horner et al. (1995) Int. Arch. Allergy Immunol. 107: 298-300); hsp 70 cloned from a
cDNA library of Cladosporium herbarum (Zhang et al. (1996) Clin. Exp. Allergy 26: 88-95); the
68 kD allergen of Penicillium notatum (Shen et al. (1995) Clin. Exp. Allergy 26: 350-356);
aldehyde dehydrogenase (ALDH) (Achatz et al. (1995) Mol. Immunol. 32: 213-227); enolase
(Achatz et al. (1995) Mol. Immunol. 32: 213-227); YCP4 (Id.); acidic ribosomal protein P2
(Id.).

Other allergens that can be modified include latex allergens, such as a major allergen
(Hev b 5) from natural rubber latex (Akasawa et al. (1996) J. Biol. Chem. 271: 25389-25393;
Slater et al. (1996) J. Biol. Chem. 271: 25394-25399).

Antigens associated with autoimmune diseases and inflammatory conditions 

Autoantigens that can be modified according to the invention include, but are not
limited to, myelin basic protein (Stinissen et al. (1996) J. Neurosci. Res. 45: 500-511) or a
fusion protein of myelin basic protein and proteolipid protein (Elliott et al. (1996) J. Clin.
Invest. 98: 1602-1612), proteolipid protein (PLP) (Rosener et al. (1997) J. Neuroimmunol. 75:
28-34), 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNPase) (Rosener et al. (1997) J.
Neuroimmunol. 75: 28-34), the Epstein Barr virus nuclear antigen-1 (EBNA-1) (Vaughan et al.
(1996) J. Neuroimmunol. 69: 95-102), HSP70 (Salvetti et al. (1996) J. Neuroimmunol. 65:
143-53; Feldmann et al. (1996) Cell 85: 307).

Antigens that can be modified according to the invention and used to treat scleroderma,
systemic sclerosis, and systemic lupus erythematosus include, for example, β-2-GPI, 50 kDa
glycoprotein (Blank et al. (1994) J. Autoimmun. 7: 441-455), Ku (p70/p80) autoantigen, or its
80-kd subunit protein (Hong et al. (1994) Invest. Ophthalmol. Vis. Sci. 35: 4023-4030; Wang et
al. (1994) J. Cell Sci. 107: 3223-3233), the nuclear autoantigens La (SS-B) and Ro (SS-A)
(Huang et al. (1997) J. Clin. Immunol. 17: 212-219; Igarashi et al. (1995) Autoimmunity 22:
33-42; Keech et al. (1996) Clin. Exp. Immunol. 104: 255-263; Manoussakis et al. (1995) J.
Autoimmun. 8: 959-969; Topfer et al. (1995) Proc. Nat'l Acad. Sci. USA 92: 875-879),
proteasome α-type subunit C9 (Feist et al. (1996) J. Exp. Med. 184: 1313-1318), Scleroderma
antigens Rpp 30, Rpp 38 or Scl-70 (Eder et al. (1997) Proc. Nat'l Acad. Sci. USA 94:
1101-1106; Hietarinta et al. (1994) Br. J. Rheumatol. 33: 323-326), the centrosome autoantigen
PCM-1 (Bao et al. (1995) Autoimmunity 22: 219-228), polymyositis-scleroderma autoantigen
(PM-Scl) (Kho et al. (1997) J. Biol. Chem. 272: 13426-13431), scleroderma (and other
systemic autoimmune disease) autoantigen CENP-A (Muro et al. (1996) Clin. Immunol.
Immunopathol. 78: 86-89), U5, a small nuclear ribonucleoprotein (snRNP) (Okano et al.
(1996) Clin. Immunol. Immunopathol. 81: 41-47), the 100-kd protein of PM-Scl autoantigen
(Ge et al. (1996) Arthritis Rheum. 39: 1588-1595), the nucleolar U3- and Th(7-2)
ribonucleoproteins (Verheijen et al. (1994) J. Immunol. Methods 169: 173-182), the ribosomal
protein L7 (Neu et al. (1995) Clin. Exp. Immunol. 100: 198-204), hPop1 (Lygerou et al. (1996)
EMBO J. 15: 5936-5948), and a 36-kd protein from nuclear matrix antigen (Deng et al. (1996)
Arthritis Rheum. 39: 1300-1307).

Antigens useful in treatment of hepatic autoimmune disorders can also be modified;
these include the cytochromes P450 and UDP-glucuronosyl-transferases (Obermayer-Straub
and Manns (1996) Baillieres Clin. Gastroenterol. 10: 501-532), the cytochromes P450 2C9 and
P450 1A2 (Bourdi et al. (1996) Chem. Res. Toxicol. 9: 1159-1166; Clemente et al. (1997) J.
Clin. Endocrinol. Metab. 82: 1353-1361), LC-1 antigen (Klein et al. (1996) J. Pediatr.
Gastroenterol. Nutr. 23: 461-465), and a 230-kDa Golgi-associated protein (Funaki et al.
(1996) Cell Struct. Funct. 21: 63-72).

Antigens useful for treatment of autoimmune disorders of the skin that can be modified
according to the invention include, but are not limited to, the 450 kD human epidermal
autoantigen (Fujiwara et al. (1996) J. Invest. Dermatol. 106: 1125-1130), the 230 kD and 180
kD bullous pemphigoid antigens (Hashimoto (1995) Keio J. Med. 44: 115-123; Murakami et
al. (1996) J. Dermatol. Sci. 13: Wl-IYT), pemphigus foliaceus antigen (desmoglein 1),
pemphigus vulgaris antigen (desmoglein 3), BPAg2, BPAg1, and type VII collagen (Batteux et
al. (1997) J. Clin. Immunol. 17: 228-233; Hashimoto et al. (1996) J. Dermatol. Sci. 12: 10-17),
a 168-kDa mucosal antigen in a subset of patients with cicatricial pemphigoid (Ghohestani et
al. (1996) J. Invest. Dermatol. 107: 136-139), and a 218-kd nuclear protein (218-kd Mi-2)
(Seelig et al. (1995) Arthritis Rheum. 38: 1389-1399).

Antigens for treating insulin dependent diabetes mellitus can also be modified; these
include, but are not limited to, insulin, proinsulin, GAD65 and GAD67, heat-shock protein 65
(hsp65), and islet-cell antigen 69 (ICA69) (French et al. (1997) Diabetes 46: 34-39; Roep
(1996) Diabetes 45: 1147-1156; Schloot et al. (1997) Diabetologia 40: 332-338), viral proteins
homologous to GAD65 (Jones and Crosby (1996) Diabetologia 39: 1318-1324), islet cell
antigen-related protein-tyrosine phosphatase (PTP) (Cui et al. (1996) J. Biol. Chem. 271:
24817-24823), GM2-1 ganglioside (Cavallo et al. (1996) J. Endocrinol. 150: 113-120; Dotta et
al. (1996) Diabetes 45: 1193-1196), glutamic acid decarboxylase (GAD) (Nepom (1995) Curr.
Opin. Immunol. 7: 825-830; Panina-Bordignon et al. (1995) J. Exp. Med. 181: 1923-1927), an
islet cell antigen (ICA69) (Karges et al. (1997) Biochim. Biophys. Acta 1360: 97-101; Roep et
al. (1996) Eur. J. Immunol. 26: 1285-1289), Tep69, the single T cell epitope recognized by T
cells from diabetes patients (Karges et al. (1997) Biochim. Biophys. Acta 1360: 97-101), ICA
512, an autoantigen of type I diabetes (Solimena et al. (1996) EMBO J. 15: 2102-2114), an
islet-cell protein tyrosine phosphatase and the 37-kDa autoantigen derived from it in type 1
diabetes (including IA-2, IA-2β) (La Gasse et al. (1997) Mol. Med. 3: 163-173), the 64 kDa
protein from In-111 cells or human thyroid follicular cells that is immunoprecipitated with sera
from patients with islet cell surface antibodies (ICSA) (Igawa et al. (1996) Endocr. J. 43:
299-306), phogrin, a homologue of the human transmembrane protein tyrosine phosphatase, an
autoantigen of type 1 diabetes (Kawasaki et al. (1996) Biochem. Biophys. Res. Commun. 227:
440-447), the 40 kDa and 37 kDa tryptic fragments and their precursors IA-2 and IA-2β in
IDDM (Lampasona et al. (1996) J. Immunol. 157: 2707-2711; Notkins et al. (1996) J.
Autoimmun. 9: 677-682), insulin or a cholera toxoid-insulin polypeptide (Bergerot et al. (1997)
Proc. Nat'l. Acad. Sci. USA 94: 4610-4614), carboxypeptidase H, the human homologue of
gp330, which is a renal epithelial glycoprotein involved in inducing Heymann nephritis in rats,
and the 38-kD islet mitochondrial autoantigen (Arden et al. (1996) J. Clin. Invest. 97: 551-561).

Useful antigens for rheumatoid arthritis treatment that can be modified according to the
invention include, but are not limited to, the 45 kDa DEK nuclear antigen, in pauciarticular onset
juvenile rheumatoid arthritis and iridocyclitis (Murray et al. (1997) J. Rheumatol. 24: 560-567),
human cartilage glycoprotein-39, an autoantigen in rheumatoid arthritis (Verheijden et al.
(1997) Arthritis Rheum. 40: 1115-1125), a 68k autoantigen in rheumatoid arthritis (Blass et al.
(1997) Ann. Rheum. Dis. 56: 317-322), collagen (Rosloniec et al. (1995) J. Immunol. 155:
4504-4511), collagen type II (Cook et al. (1996) Arthritis Rheum. 39: 1720-1727; Trentham
(1996) Ann. N. Y. Acad. Sci. 778: 306-314), cartilage link protein (Guerassimov et al. (1997) J.
Rheumatol. 24: 959-964), ezrin, radixin and moesin, which are auto-immune antigens in
rheumatoid arthritis (Wagatsuma et al. (1996) Mol. Immunol. 33: 1171-1176), and
mycobacterial heat shock protein 65 (Ragno et al. (1997) Arthritis Rheum. 40: 277-283).

Antigens useful for treatment of autoimmune thyroid disorders that can be modified
include, for example, thyroid peroxidase and the thyroid stimulating hormone receptor (Tandon
and Weetman (1994) J. R. Coll. Physicians Lond. 28: 10-18), thyroid peroxidase from human
Graves' thyroid tissue (Gardas et al. (1997) Biochem. Biophys. Res. Commun. 234: 366-370;
Zimmer et al. (1997) Histochem. Cell Biol. 107: 115-120), a 64-kDa antigen associated with
thyroid-associated ophthalmopathy (Zhang et al. (1996) Clin. Immunol. Immunopathol. 80:
236-244), the human TSH receptor (Nicholson et al. (1996) J. Mol. Endocrinol. 16: 159-170),
and the 64 kDa protein from In-111 cells or human thyroid follicular cells that is
immunoprecipitated with sera from patients with islet cell surface antibodies (ICSA) (Igawa et
al. (1996) Endocr. J. 43: 299-306).

Other associated antigens that can be modified include, but are not limited to, Sjogren's
syndrome (α-fodrin; Haneji et al. (1997) Science 276: 604-607), myasthenia gravis (the human
M2 acetylcholine receptor or fragments thereof, specifically the second extracellular loop of
the human M2 acetylcholine receptor; Fu et al. (1996) Clin. Immunol. Immunopathol. 78:
203-207), vitiligo (tyrosinase; Fishman et al. (1997) Cancer 79: 1461-1464), a 450 kD human
epidermal autoantigen recognized by serum from individuals with blistering skin disease, and
ulcerative colitis (chromosomal proteins HMG1 and HMG2; Sobajima et al. (1997) Clin. Exp.
Immunol. 107: 135-140).

Sperm Antigens 

Sperm antigens which can be used in the genetic vaccines include, for example, lactate
dehydrogenase (LDH-C4), galactosyltransferase (GT), SP-10, rabbit sperm autoantigen (RSA),
guinea pig (g)PH-20, cleavage signal protein (CS-1), HSA-63, human (h)PH-20, and AgKA
(Zhu and Naz (1994) Arch. Androl. 33: 141-144), the synthetic sperm peptide, P10G (O'Rand
et al. (1993) J. Reprod. Immunol. 25: 89-102), the 135kD, 95kD, 65kD, 47kD, 41kD and 23kD
proteins of sperm, and the FA-1 antigen (Naz et al. (1995) Arch. Androl. 35: 225-231), and the
35 kD fragment of cytokeratin 1 (Lucas et al. (1996) Anticancer Res. 16: 2493-2496).

Also, examples of antigens are set forth in Punnonen et al. (1999) WO 99/41369; Punnonen et
al. (1999) WO 99/41383; Punnonen et al. (1999) WO 99/41368; and Punnonen et al. (1999)
WO 99/41402, the contents of all of which are incorporated herein by reference in their
entirety for all purposes. Other useful antigens have been described in the literature or can be
discovered using genomics approaches.

Peptide addition

In principle the peptide addition X can be any stretch of amino acid residues ranging from a
single amino acid residue to a large protein, e.g. a mature protein. Usually, the peptide addition
X comprises 1-500 amino acid residues, such as 2-500, normally 2-50 or 3-50 amino acid
residues, such as 3-20 amino acid residues. The length of the peptide addition to be used for
modification of a given polypeptide depends on, or is determined on the basis of, a number of
factors including the type of polypeptide of interest and the desired effect to be achieved by the
modification. Normally, the peptide addition has less than 90% identity to the amino acid
sequence of a native full length polypeptide, in particular less than 80% identity, such as less
than 70% identity or an even lower degree of identity to a full length protein. In one embodiment
the peptide addition may constitute a part of a full length protein (e.g. 1-50 amino acid residues
thereof).

The peptide addition may be designed by a site-specific or random approach, e.g. as
outlined in further detail in the Methods section below. That section also describes a set of
guidelines useful for preparing a peptide addition for use in the present invention.
It will be understood that those guidelines are intended for illustration purposes only and that a
person skilled in the art will be aware of alternative useful routes for design of a peptide
addition. Thus, the method of designing a peptide addition for use herein should not be
considered limited to that described in the Materials section.

The number of glycosylation sites should be sufficient to provide the desired effect.
Typically, the peptide addition X comprises 1-20, such as 1-10 glycosylation sites. For
instance, the peptide addition X comprises 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 glycosylation sites. It is
well known that one frequently occurring consequence of modifying an amino acid sequence
of, e.g., a human protein is that new epitopes are created by such modification. In order to
shield any new epitopes created by the peptide addition, it is desirable that sufficient
glycosylation sites are present to enable shielding of all epitopes introduced into the sequence.
This is e.g. achieved when the peptide addition X comprises at least one glycosylation site
within a stretch of 30 contiguous amino acid residues, such as at least one glycosylation site
within 20 amino acid residues or at least one glycosylation site within 10 amino acid residues,
in particular 1-3 glycosylation sites within a stretch of 10 contiguous amino acid residues in the
peptide addition X.
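
The site-density criterion can be illustrated with a short sketch. It assumes, purely for illustration, that a glycosylation site is an N-glycosylation sequon N-X-[T/S] with X different from Pro, that a site lies "within" a stretch when its Asn residue does, and that the criterion is applied to every stretch of the given length; the function names are hypothetical and not part of the disclosure.

```python
import re

# Illustrative assumption: a site is the sequon N-X-[T/S], X != Pro.
# The lookahead lets overlapping sequons be detected.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(seq):
    """Return 0-based indices of the Asn residue of each N-X-[T/S] sequon."""
    return [m.start() for m in SEQUON.finditer(seq.upper())]


def every_window_has_site(seq, window=30):
    """True if every stretch of `window` contiguous residues holds a site Asn.

    A sequence no longer than `window` is treated as a single stretch.
    """
    sites = sequon_positions(seq)
    if len(seq) <= window:
        return bool(sites)
    return all(
        any(start <= p < start + window for p in sites)
        for start in range(len(seq) - window + 1)
    )


if __name__ == "__main__":
    addition = "ANITGSNITGSNIT"
    print(sequon_positions(addition))
    print(every_window_has_site(addition, window=10))
```

Stricter or looser readings of the criterion (for example, requiring the whole tripeptide to lie inside the stretch) would only change the comparison inside `every_window_has_site`.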

Thus, in one embodiment the peptide addition X comprises at least two glycosylation
sites, wherein two of said sites are separated by at most 10 amino acid residues, none of which
comprises a glycosylation site. Furthermore, the polypeptide Pp can comprise at least one
introduced glycosylation site, in particular 1-5 introduced glycosylation sites. Analogously, the
polypeptide Pp can comprise at least one removed glycosylation site, in particular 1-5 removed
glycosylation sites.
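
As a rough sketch of the spacing condition, the following checks whether two neighbouring sites in a candidate peptide addition are separated by at most 10 residues. It assumes that a site is the tripeptide N-X-[T/S] (X not Pro) and that the separation is counted as the number of residues between the end of one tripeptide and the Asn of the next; both the counting convention and the function name are illustrative assumptions.

```python
import re

# Illustrative assumption: a site is the tripeptide N-X-[T/S], X != Pro.
SEQUON = re.compile(r"N(?=[^P][ST])")


def has_close_site_pair(seq, max_gap=10):
    """True if two neighbouring sequons are separated by at most `max_gap` residues.

    The gap is counted from the residue after the first tripeptide up to,
    but not including, the Asn of the next one. Because only neighbouring
    sites are compared, the residues in between contain no further site.
    """
    starts = [m.start() for m in SEQUON.finditer(seq.upper())]
    return any(b - (a + 3) <= max_gap for a, b in zip(starts, starts[1:]))


if __name__ == "__main__":
    print(has_close_site_pair("ANITGSNITGSNIT"))
    print(has_close_site_pair("NIT" + "A" * 11 + "NIT"))
```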

The glycosylation site of the peptide addition may be an in vivo or in vitro
glycosylation site. Preferably, the glycosylation site is an in vivo glycosylation site, in
particular an N-glycosylation site, since glycosylation of such a site is easier to control than
that of an O-glycosylation site. Accordingly, in a preferred embodiment the peptide addition X
comprises at least one N-glycosylation site, typically at least two N-glycosylation sites. For
instance, the peptide addition X has the structure X1-N-X2-[T/S]/C-Z, wherein X1 is a peptide
comprising at least one amino acid residue or is absent, X2 is any amino acid residue different
from Pro, and Z is absent or a peptide comprising at least one amino acid residue. For instance,
X1 is absent, X2 is an amino acid residue selected from the group consisting of I, A, G, V and S
(all relatively small amino acid residues), and Z comprises at least 1 amino acid residue.
For instance, Z can be a peptide comprising 1-50 amino acid residues and, e.g., 1-10
glycosylation sites.
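
The structure X1-N-X2-[T/S]/C-Z can be made concrete with a minimal sketch that locates sequons in a one-letter amino acid string and splits the string around the first one. The function names and the dictionary keys are hypothetical; the only rule taken from the text is that X2 is any residue other than Pro and that the third position is T or S (or, optionally, C).

```python
import re


def find_sequons(seq, allow_cys=False):
    """Return (index_of_N, tripeptide) for each N-X2-[T/S] (or C) sequon.

    X2 may be any residue except Pro. A lookahead is used so that
    overlapping sequons such as in 'NNTS' are all reported.
    """
    third = "STC" if allow_cys else "ST"
    pattern = re.compile(rf"N(?=([^P])([{third}]))")
    return [
        (m.start(), "N" + m.group(1) + m.group(2))
        for m in pattern.finditer(seq.upper())
    ]


def split_at_first_sequon(seq):
    """Decompose a peptide as X1-N-X2-[T/S]/C-Z around its first sequon."""
    s = seq.upper()
    hits = find_sequons(s, allow_cys=True)
    if not hits:
        return None
    i, _ = hits[0]
    return {"X1": s[:i], "X2": s[i + 1], "third": s[i + 2], "Z": s[i + 3:]}


if __name__ == "__main__":
    print(find_sequons("ASPINATNPT"))
    print(split_at_first_sequon("INAT"))
```

In this sketch `X1` or `Z` comes back as an empty string when the corresponding part is absent.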

In another polypeptide of the invention X1 comprises at least one amino acid residue,
e.g. 1-50 amino acid residues, X2 is an amino acid residue selected from the group consisting
of I, A, G, V and S, and Z is absent. For instance, X1 comprises 1-10 glycosylation sites.
For instance, the peptide addition for use in the present invention can comprise a peptide
sequence selected from the group consisting of INA[T/S], GNI[T/S], VNI[T/S], SNI[T/S],
ASNI[T/S], NI[T/S], SPINA[T/S], ASPINA[T/S], ANI[T/S]ANI[T/S]ANI,
ANI[T/S]GSNI[T/S]GSNI[T/S], FNI[T/S]VNI[T/S]V,
YNI[T/S]VNI[T/S]V, AFNI[T/S]VNI[T/S]V, AYNI[T/S]VNI[T/S]V, APND[T/S]VNI[T/S]V,
ANI[T/S], ASNS[T/S]NNG[T/S]LNA[T/S], ANH[T/S]NE[T/S]NA[T/S], GSPINA[T/S],
ASPINA[T/S]SPINA[T/S], ANN[T/S]NY[T/S]NW[T/S], ATNI[T/S]LNY[T/S]AN[T/S]T,
AANS[T/S]GNI[T/S]ING[T/S], AVNW[T/S]SND[T/S]SNS[T/S], GNA[T/S],
AVNW[T/S]SND[T/S]SNS[T/S], ANN[T/S]NY[T/S]NS[T/S], ANNTNYTNWT,
ANI[T/S]VNI[T/S]V, ]Sa?[T/S]VNF[T/S] and NI[T/S]VNI[T/S]V, wherein [T/S] is either a T
or an S residue, preferably a T residue. Other non-limiting examples include a peptide addition
comprising the sequence NSTQNATA, which corresponds to positions 231 to 238 of the
human calcium activated channel 2 precursor (to add two N-glycosylation sites), or the
sequence A^ILTVR^^J^lNVTV, which corresponds to positions 538 to 551 of the human G
protein coupled receptor 64 (to add three N-glycosylation sites).
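
The bracket notation [T/S] stands for a choice between residues at one position, so each listed template denotes a small family of concrete sequences. A minimal sketch of how such a template might be expanded is given below; the helper name `expand_template` is hypothetical and the parsing rule (single letters separated by slashes inside square brackets) is an assumption about the notation.

```python
import itertools
import re

# A choice group is one or more single-letter alternatives, e.g. "[T/S]".
CHOICE = re.compile(r"\[([A-Z](?:/[A-Z])+)\]")


def expand_template(template):
    """Expand a template such as 'INA[T/S]' into all concrete sequences."""
    # re.split with one capturing group puts the captured choices at odd indices.
    parts = CHOICE.split(template)
    options = [p.split("/") if i % 2 else [p] for i, p in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*options)]


if __name__ == "__main__":
    print(expand_template("INA[T/S]"))
    print(len(expand_template("ANI[T/S]GSNI[T/S]GSNI[T/S]")))
```

Since a T residue is stated to be preferred, the first sequence returned for each template (all choices resolved to T) would usually be the one of most interest.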

The peptide addition can comprise one or more of these peptide sequences, i.e. at least
two of said sequences either directly linked together or separated by one or more amino acid
residues, or can contain two or more copies of any of these peptide sequences. It will be
understood that the above specific sequences are given for illustrative purposes and thus do not
constitute an exclusive list of peptide sequences of use in the present invention.

In a more specific embodiment the peptide addition X is selected from the group
consisting of INA[T/S], GNI[T/S], VNI[T/S], SNI[T/S], ASNI[T/S], NI[T/S], SPINA[T/S],
ASPINA[T/S], ANI[T/S]ANI[T/S]ANI, and ANI[T/S]GSNI[T/S]GSNI[T/S], wherein [T/S] is
either a T or an S residue, preferably a T residue.

As stated further above the polypeptide Pp can be a native polypeptide that may or may
not comprise one or more glycosylation sites. In order to further modify the glycosylation of
the polypeptide Pp of interest (in terms of the number of oligosaccharide moieties attached to
the polypeptide), the polypeptide Pp can be a variant of a native polypeptide that differs from
said polypeptide in at least one introduced or at least one removed glycosylation site.
For instance, the polypeptide Pp comprises at least one introduced glycosylation site, in
particular 1-5 introduced glycosylation sites, such as 2-5 introduced glycosylation sites.

In order to affect the total glycosylation of the polypeptide of interest the glycosylation
site is introduced so that the N residue of said glycosylation site is exposed at the surface of the
polypeptide, when folded in its active form. Likewise, a glycosylation site to be removed is
selected from those having an N residue exposed at the surface of the polypeptide.

In one embodiment, the peptide addition X has an N residue in position -2 or -1, and
the polypeptide Pp or Px has a T or an S residue in position +1 or +2, respectively, the residue
numbering being made relative to the N-terminal amino acid residue of Pp or Px, whereby an
N-glycosylation site is formed.
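
The junction case can be sketched as follows: with the first residue of Pp numbered +1 and the last residue of X numbered -1, an N at -2 pairs with a T or S at +1, and an N at -1 pairs with a T or S at +2. The helper below is a hypothetical illustration that assumes an N-terminal peptide addition and the usual N-X-[T/S] rule with X different from Pro.

```python
def junction_sequons(x, pp):
    """Find N-X-[T/S] sequons spanning the X/Pp junction.

    `x` is the peptide addition placed N-terminally of `pp`. Positions are
    numbered relative to the first residue of `pp` (+1); the last residue
    of `x` is -1. Returns a list of (position_of_N, tripeptide).
    """
    x, pp = x.upper(), pp.upper()
    hits = []
    for n_pos in (-2, -1):
        from_x = -n_pos          # residues of the tripeptide taken from X
        from_pp = 3 - from_x     # residues of the tripeptide taken from Pp
        if len(x) < from_x or len(pp) < from_pp:
            continue
        tri = x[len(x) - from_x:] + pp[:from_pp]
        if tri[0] == "N" and tri[1] != "P" and tri[2] in "ST":
            hits.append((n_pos, tri))
    return hits


if __name__ == "__main__":
    print(junction_sequons("ANI", "TGK"))  # N at -2, T at +1
    print(junction_sequons("AN", "ITK"))   # N at -1, T at +2
```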

Glycosylation 

The polypeptide of the invention is glycosylated (i.e. comprises an in vivo attached N- or
O-linked oligosaccharide moiety or in vitro attached oligosaccharide moiety) and furthermore has
an altered glycosylation profile as compared to that of the polypeptide Pp. For instance, the
altered glycosylation profile is a consequence of an altered, normally increased, number of
attached oligosaccharide moieties and/or an altered type or distribution of attached
oligosaccharide moieties.

Furthermore, for polypeptides intended for therapeutic or veterinary uses or to which a
human or animal is otherwise exposed, the type of oligosaccharide moiety to be attached
should normally be one that does not lead to increased immunogenicity of the polypeptide as
compared to that of the polypeptide Pp. The coupling of an oligosaccharide moiety may take
place in vivo or in vitro. In order to achieve in vivo glycosylation, a nucleotide sequence
encoding the polypeptide should be inserted in a glycosylating, eukaryotic expression host. The
expression host cell may be selected from fungal (filamentous fungal or yeast), insect,
mammalian cells or transgenic plant cells as disclosed in further detail in the section entitled
"Methods of preparing a polypeptide of the invention". Also, the glycosylation may be
achieved in the human body when using a nucleotide sequence encoding the polypeptide of the
invention in gene therapy.

In vitro glycosylation can be achieved by attaching chemically synthesized
oligosaccharide structures to the polypeptide using a variety of different chemistries, e.g. the
chemistries employed for attachment of PEG to proteins, wherein the oligosaccharide is linked
to a functional group, optionally via a short spacer (see the section entitled Conjugation to a
Non-Oligosaccharide Macromolecular Moiety). The in vitro glycosylation can be carried out
SUBSTITUTE SHEET (RULE 26) 



wo 02/02597 



PCT/DKOl/00459 



35 

in a suitable buffer at pH 4-7 in protein concentrations of O.S-2 mg/ml and a volume of 0.02-2 
ml. The activated naannose compound is present in 2-200 fold molar excess, and reactions are 
incubated at 4~25°C for periods of 0. 1-3 hours. In vitro glycosylated GCB polypeptides are 
purified by dialysis and standard chromatographic techniques. 
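
By way of a worked example, the molar excess stated above can be converted into a mass of
activated compound as sketched below. The function name, the 60 kDa polypeptide molecular
weight and the 300 Da reagent molecular weight are assumptions chosen for illustration; only
the concentration, volume and molar excess ranges are taken from the text.

```python
def reagent_mass_mg(protein_mg_per_ml: float, volume_ml: float,
                    protein_mw_da: float, reagent_mw_da: float,
                    fold_excess: float) -> float:
    """Return the mass (mg) of activated reagent for a given molar excess."""
    # Ranges quoted in the text; values outside them are rejected.
    if not 0.5 <= protein_mg_per_ml <= 2:
        raise ValueError("protein concentration outside 0.5-2 mg/ml")
    if not 0.02 <= volume_ml <= 2:
        raise ValueError("volume outside 0.02-2 ml")
    if not 2 <= fold_excess <= 200:
        raise ValueError("molar excess outside 2-200 fold")
    # mg divided by g/mol gives mmol of polypeptide.
    protein_mmol = protein_mg_per_ml * volume_ml / protein_mw_da
    # mmol multiplied by g/mol gives mg of reagent.
    return protein_mmol * fold_excess * reagent_mw_da


# Hypothetical inputs: 1 mg/ml polypeptide in 1 ml, 50-fold molar excess.
mass = reagent_mass_mg(1.0, 1.0, 60_000, 300, 50)
print(f"{mass:.3f} mg of activated compound")
```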

Other in vitro glycosylation methods are described, for example in WO 87/05330, by
Aplin et al., CRC Crit Rev. Biochem., pp. 259-306, 1981, by Lundblad and Noyes, Chemical
Reagents for Protein Modification, CRC Press Inc. Boca Raton, FL, by Yan and Wold,
Biochemistry, 1984, Jul. 31: 23(16): 3759-65, and by Doebber et al., J. Biol. Chem., 257,
pp. 2193-2199, 1982.

Furthermore, in vitro glycosylation to protein- and peptide-bound Gln-residues can be
carried out by transglutaminases (TGases). Transglutaminases catalyse the transfer of donor
amine-groups to protein- and peptide-bound Gln-residues in a so-called cross-linking reaction.
The donor-amine groups can be protein- or peptide-bound e.g. as the ε-amino-group in
Lys-residues or it can be part of a small or large organic molecule. An example of a small
organic molecule functioning as amino-donor in TGase-catalysed cross-linking is putrescine
(1,4-diaminobutane). An example of a larger organic molecule functioning as amino-donor in
TGase-catalysed cross-linking is an amine-containing PEG (Sato et al., Biochemistry 35, 1996,
13072-13080).

TGases, in general, are highly specific enzymes, and not every Gln-residue exposed on
the surface of a protein is accessible to TGase-catalysed cross-linking to amino-containing
substances. In order to render a protein susceptible to TGase-catalysed cross-linking reactions
stretches of amino acid sequence known to function very well as TGase substrates are inserted
at convenient positions in the amino acid sequence encoding a GCB polypeptide. Several
amino acid sequences are known to be or to contain excellent natural TGase substrates e.g.
substance P, elafin, fibrinogen, fibronectin, α2-plasmin inhibitor, α-caseins, and β-caseins and
may thus be inserted into and thereby constitute part of the amino acid sequence of a
polypeptide of the invention.

The nature and number of oligosaccharide moieties of a glycosylated polypeptide of the
invention may be determined by a number of different methods known in the art e.g. by lectin
binding studies (Reddy et al., 1985, Biochem. Med. 33: 200-210; Cummings, 1994, Meth.
Enzymol. 230: 66-86; Protein Protocols (Walker ed.), 1998, chapter 9); by reagent array
analysis method (RAAM) sequencing of released oligosaccharides (Edge et al., 1992, Proc.
Natl. Acad. Sci. USA 89: 6338-6342; Prime et al., 1996, J. Chrom. A 720: 263-274); by
RAAM sequencing of released oligosaccharides in combination with mass spectrometry
(Klausen, et al., 1998, Molecular Biotechnology 9: 195-204); or by combining proteolytic
degradation, glycopeptide purification by HPLC, exoglycosidase degradations and mass
spectrometry (Krogh et al., 1997, Eur. J. Biochem. 244: 334-342). Specific methods for
determining the glycosylation profile are described in the examples section hereinafter.

Normally, the glycosylated polypeptide of the invention comprises 1-15 oligosaccharide
moieties, such as 1-10 or 1-6 oligosaccharide moieties. Usually, at least one of these is
attached to the peptide addition and further oligosaccharide structures are attached to the
peptide addition or the polypeptide Pp.

Polypeptide of the invention conjugated to a second non-peptide moiety

It can be advantageous that the glycosylated polypeptide of the invention further comprises at
least one second non-peptide moiety. The term "second non-peptide moiety" is intended to
indicate a non-peptide moiety different from an oligosaccharide moiety, e.g. a polymer
molecule, a lipophilic compound and an organic derivatizing agent.

For this purpose the polypeptide must comprise at least one attachment group for the
second non-peptide moiety. The attachment group can be one present on an amino acid residue,
e.g., selected from the group consisting of the N-terminal or C-terminal amino acid residue of
the polypeptide of the invention, lysine, cysteine, arginine, glutamine, aspartic acid, glutamic
acid, serine, tyrosine, histidine, phenylalanine and tryptophan, or on an oligosaccharide moiety
attached to the polypeptide. For instance, the attachment group for the non-peptide moiety is an
epsilon-amino group.
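
As a minimal sketch, candidate attachment groups can be located in a primary sequence as
shown below. The function names and the example sequence are hypothetical. The count is based
on sequence alone and is therefore an upper bound, since it does not consider whether a
residue is exposed at the surface of the folded polypeptide.

```python
def attachment_positions(sequence: str, residue: str) -> list[int]:
    """Return 1-based positions of a given one-letter residue in the sequence."""
    target = residue.upper()
    return [i for i, aa in enumerate(sequence.upper(), start=1) if aa == target]


def amine_group_count(sequence: str) -> int:
    """Lysine epsilon-amino groups plus the single N-terminal alpha-amino group."""
    return len(attachment_positions(sequence, "K")) + (1 if sequence else 0)


example = "MKTAYCKQR"  # hypothetical sequence, for illustration only
print(attachment_positions(example, "K"))  # lysine positions
print(attachment_positions(example, "C"))  # cysteine positions
print(amine_group_count(example))          # lysines plus the N-terminus
```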

It will be understood that an attachment group for the second non-peptide moiety may
be provided by the N-terminal peptide addition, within the polypeptide Pp, and/or as a
C-terminal peptide addition (having similar properties to those described above for the peptide
addition X). In one embodiment, the peptide addition X comprising or contributing to an
attachment site further comprises an attachment group for a second non-peptide moiety. For
instance, the peptide addition may comprise 1-20, such as 1-10 attachment groups for a second
non-peptide moiety. Such attachment groups may be distributed in a similar manner as that
described immediately above for glycosylation sites. Also, the peptide addition X can comprise
at least two attachment groups for the second non-peptide moiety.

Also, the polypeptide Pp can be a variant of a native polypeptide, which as compared to
said native polypeptide, comprises at least one introduced and/or at least one removed
attachment group for the second non-peptide moiety. For instance, the polypeptide Pp
comprises at least one introduced attachment group, in particular 1-5 introduced attachment
groups, such as 2-5 introduced attachment groups.

The attachment group is preferably located in a position that is exposed at the surface of
the folded protein and thus accessible for conjugation to the polymer molecule. For instance,
attachment to one or more polymer molecules increases the molecular weight of the
polypeptide and can further serve to shield one or more epitopes thereof. The polymer
molecule may be any of the molecules mentioned in the section entitled "Conjugation to a
polymer molecule", but is preferably selected from the group consisting of linear or branched
polyethylene glycol or polyalkylene oxide. Most preferably, the polymer molecule is
mPEG-SPA, mPEG-SCM, mPEG-BTC from Shearwater Polymers, Inc., SC-PEG from Enzon, Inc.,
tresylated mPEG (US 5,880,255) or oxycarbonyl-oxy-N-dicarboxyimide PEG (US 5,122,614)
(and the relevant attachment group is one present on a lysine or N-terminal residue).
Alternatively, the polymer molecule is an activated PEG molecule reactive with a cysteine
residue, e.g. VS-PEG from Shearwater Polymers.

Especially, when the polypeptide Pp is an industrial enzyme, the second non-peptide
moiety may be one which is capable of cross-linking and thereby of being immobilized on a
suitable solid support. Such cross-linking polymers are available from Shearwater Polymers,
Inc. It will be understood that the peptide addition of the polypeptide according to this
embodiment comprises an attachment group for the cross-linking polymer in question. In
connection with this embodiment the polypeptide Pp is preferably an amyloglucosidase, an
alpha-amylase, a glucose isomerase, an amidase, or a lipolytic enzyme.

In the following sections "Conjugation to a lipophilic compound", "Conjugation to a
polymer molecule", and "Conjugation to an organic derivatizing agent" conjugation to specific
types of non-peptide moieties is described.

It will be understood that a conjugation step of any method of the invention only finds
relevance when a non-polypeptide moiety other than an in vivo attached oligosaccharide
moiety is to be conjugated to the polypeptide, since in vivo glycosylation takes place during the
expression step when using an appropriate glycosylating host cell as expression host.
Accordingly, whenever a conjugation step occurs in the present invention this is intended to be
conjugation to a non-polypeptide moiety other than an oligosaccharide moiety attached by in
vivo glycosylation during expression in a glycosylating organism. In vitro glycosylation
methods are described in the section entitled "Glycosylation".

Conjugation to a lipophilic compound

The polypeptide and the lipophilic compound can be conjugated to each other, either directly
or by use of a linker. The lipophilic compound can be a natural compound such as a saturated
or unsaturated fatty acid, a fatty acid diketone, a terpene, a prostaglandin, a vitamin, a
carotenoid or steroid, or a synthetic compound such as a carbon acid, an alcohol, an amine
and sulphonic acid with one or more alkyl-, aryl-, alkenyl- or other multiple unsaturated
compounds. Furthermore, the lipophilic compound may be any of the lipophilic substituents
disclosed in WO 97/31022, the contents of which are incorporated herein by reference. The
conjugation between the polypeptide and the lipophilic compound, optionally through a linker,
can be done according to methods known in the art, e.g. as described by Bodanszky in Peptide
Synthesis, John Wiley, New York, 1976 and in WO 96/12505 and further as described in WO
97/31022.

Conjugation to a polymer molecule 

The polymer molecule to be coupled to the polypeptide of the invention can be any suitable
polymer molecule, such as a natural or synthetic homo-polymer or heteropolymer, typically
with a molecular weight in the range of 300-100,000 Da, such as 300-20,000 Da, more
preferably in the range of 500-10,000 Da, even more preferably in the range of 500-5000 Da.

Examples of homo-polymers include a polyol (i.e. poly-OH), a polyamine (i.e.
poly-NH2) and a polycarboxylic acid (i.e. poly-COOH). A hetero-polymer is a polymer that
comprises different coupling groups, such as a hydroxyl group and an amine group.

Examples of suitable polymer molecules include polymer molecules selected from the
group consisting of polyalkylene oxide (PAO), including polyalkylene glycol (PAG), such as
polyethylene glycol (PEG) and polypropylene glycol (PPG), branched PEGs, poly-vinyl
alcohol (PVA), poly-carboxylate, poly-(vinylpyrolidone), polyethylene-co-maleic acid
anhydride, polystyrene-co-malic acid anhydride, dextran, including carboxymethyl-dextran, or
any other biopolymer suitable for the intended purpose, such as for reducing immunogenicity
and/or increasing functional in vivo half-life and/or serum half-life, or for providing
immobilization properties to the polypeptide (as discussed in the section entitled "Polypeptide
of interest"). Another example of a polymer molecule is human albumin or another abundant
plasma protein. Generally, polyalkylene glycol-derived polymers are biocompatible, non-toxic,
non-antigenic, non-immunogenic, have various water solubility properties, and are easily
excreted from living organisms.

PEG is the preferred polymer molecule for reducing immunogenicity, allergenicity
and/or increasing half-life, since it has only few reactive groups capable of cross-linking
compared, e.g., to polysaccharides such as dextran, and the like. In particular, monofunctional
PEG, e.g. methoxypolyethylene glycol (mPEG), is of interest since its coupling chemistry is
relatively simple (only one reactive group is available for conjugating with attachment groups
on the polypeptide). Consequently, the risk of cross-linking is eliminated, the resulting
polypeptide conjugates are more homogeneous and the reaction of the polymer molecules with
the polypeptide is easier to control.

To effect covalent attachment of the polymer molecule(s) to the polypeptide, the
hydroxyl end groups of the polymer molecule must be provided in activated form, i.e. with
reactive functional groups. Suitable activated polymer molecules are commercially available,
e.g. from Shearwater Polymers, Inc., Huntsville, AL, USA. Alternatively, the polymer
molecules can be activated by conventional methods known in the art, e.g. as disclosed in WO
90/13540. Specific examples of activated linear or branched polymer molecules for use in the
present invention are described in the Shearwater Polymers, Inc. 1997 and 2000 Catalogs
(Functionalized Biocompatible Polymers for Research and pharmaceuticals, Polyethylene
Glycol and Derivatives, incorporated herein by reference). Specific examples of activated PEG
polymers include the following linear PEGs: NHS-PEG (e.g. SPA-PEG, SSPA-PEG, SBA-PEG,
SS-PEG, SSA-PEG, SC-PEG, SG-PEG, and SCM-PEG), and NOR-PEG, BTC-PEG,
EPOX-PEG, NCO-PEG, NPC-PEG, CDI-PEG, ALD-PEG, TRES-PEG, VS-PEG, IODO-PEG,
and MAL-PEG, and branched PEGs such as PEG2-NHS and those disclosed in US 5,932,462
and US 5,643,575, both of which are incorporated herein by reference. Furthermore, the
following publications, incorporated herein by reference, disclose useful polymer molecules
and/or PEGylation chemistries: US 5,824,778, US 5,476,653, WO 97/32607, EP 229,108, EP
402,378, US 4,902,502, US 5,281,698, US 5,122,614, US 5,219,564, WO 92/16555, WO
94/04193, WO 94/14758, WO 94/17039, WO 94/18247, WO 94/28024, WO 95/00162, WO
95/11924, WO 95/13090, WO 95/33490, WO 96/00080, WO 97/18832, WO 98/41562, WO
98/48837, WO 99/32134, WO 99/32139, WO 99/32140, WO 96/40791, WO 98/32466, WO
95/06058, EP 439 508, WO 97/03106, WO 96/21469, WO 95/13312, EP 921 131, US
5,736,625, WO 98/05363, EP 809 996, US 5,629,384, WO 96/41813, WO 96/07670, US
5,473,034, US 5,516,673, EP 605 963, US 5,382,657, EP 510 356, EP 400 472, EP 183 503
and EP 154 316.

The conjugation of the polypeptide and the activated polymer molecules is conducted
by use of any conventional method, e.g. as described in the following references (which also
describe suitable methods for activation of polymer molecules): R.F. Taylor, (1991), "Protein
immobilisation. Fundamental and applications", Marcel Dekker, N.Y.; S.S. Wong, (1992),
"Chemistry of Protein Conjugation and Crosslinking", CRC Press, Boca Raton; G.T.
Hermanson et al., (1993), "Immobilized Affinity Ligand Techniques", Academic Press, N.Y.
The skilled person will be aware that the activation method and/or conjugation chemistry to be
used depends on the attachment group(s) of the polypeptide (examples of which are given
further above), as well as the functional groups of the polymer (e.g. being amine, hydroxyl,
carboxyl, aldehyde, sulfydryl, succinimidyl, maleimide, vinylsulfone or haloacetate). The
PEGylation can be directed towards conjugation to all available attachment groups on the
polypeptide (i.e. such attachment groups that are exposed at the surface of the polypeptide) or
can be directed towards one or more specific attachment groups, e.g. the N-terminal amino
group (US 5,985,265). Furthermore, the conjugation can be achieved in one step or in a
stepwise manner (e.g. as described in WO 99/55377).

It will be understood that the PEGylation is designed so as to produce the optimal
molecule with respect to the number of PEG molecules attached, the size and form of such
molecules (e.g. whether they are linear or branched), and where in the polypeptide such
molecules are attached. For instance, the molecular weight of the polymer to be used can be
chosen on the basis of the desired effect to be achieved. For instance, if the primary purpose of
the conjugation is to achieve a polypeptide having a high molecular weight (e.g. to reduce renal
clearance) it is usually desirable to conjugate as few high Mw polymer molecules as possible to
obtain the desired molecular weight. When a high degree of epitope shielding is desirable this
can be obtained by use of a sufficiently high number of low molecular weight polymer
molecules (e.g. with a molecular weight of about 5,000 Da) to effectively shield all or most
epitopes of the polypeptide. For instance, 2-8, such as 3-6 such polymers can be used.
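
The trade-off between few high Mw polymer molecules and many low Mw polymer molecules can
be illustrated with simple arithmetic. In the sketch below the function name, the 20 kDa
polypeptide and the 60 kDa target are assumptions made for illustration only; they are not
values from the specification.

```python
import math


def polymers_needed(polypeptide_da: float, target_da: float, polymer_da: float) -> int:
    """Smallest number of polymer molecules that brings the conjugate to the target Mw."""
    deficit = target_da - polypeptide_da
    if deficit <= 0:
        return 0
    return math.ceil(deficit / polymer_da)


# Hypothetical case: a 20 kDa polypeptide to be brought to at least 60 kDa.
print(polymers_needed(20_000, 60_000, 20_000))  # few high Mw polymer molecules
print(polymers_needed(20_000, 60_000, 5_000))   # many low Mw polymer molecules
```

The same target weight thus requires either a small number of attachment groups with a large
polymer or a larger number of attachment groups with a small polymer, the latter being the
option that favours epitope shielding.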

In connection with conjugation to only a single attachment group on the protein (as
described in US 5,985,265), it can be advantageous that the polymer molecule, which can be
linear or branched, has a high molecular weight, e.g. about 20 kDa.

Normally, the polymer conjugation is performed under conditions aiming at reacting all
available polymer attachment groups with polymer molecules. Typically, the molar ratio of
activated polymer molecules to polypeptide is up to about 1000:1, in particular 200:1,
preferably 100:1, such as 10:1 or 5:1, but also equimolar ratios can be used in order to obtain
optimal reaction.

It is also contemplated according to the invention to couple the polymer molecules to
the polypeptide through a linker. Suitable linkers are well known to the skilled person. A
preferred example is cyanuric chloride (Abuchowski et al., (1977), J. Biol. Chem., 252,
3578-3581; US 4,179,337; Shafer et al., (1986), J. Polym. Sci. Polym. Chem. Ed., 24, 375-378).

Subsequent to the conjugation residual activated polymer molecules are blocked
according to methods known in the art, e.g. by addition of primary amine to the reaction
mixture, and the resulting inactivated polymer molecules are removed by a suitable method.

In a specific embodiment, the polypeptide of the invention is one that comprises one or
more PEG molecules attached to the peptide addition, but not to the polypeptide P. For
instance, the PEG molecule is attached to one or more cysteine residues present in the peptide
addition X and, if necessary, one or more cysteine residues have been removed from the
polypeptide P of interest in order to avoid conjugation thereto.

In another specific embodiment, the polypeptide of the invention comprises at least one
PEG molecule attached to a lysine residue of the peptide addition X, in particular a linear or
branched PEG molecule with a molecular weight of at least 5 kDa.

Methods of preparing a polypeptide of the invention 

The invention further comprises a method of producing the polypeptide of the invention, which
method comprises culturing a host cell transformed or transfected with a nucleotide sequence
encoding the polypeptide under conditions permitting the expression of the polypeptide, and
recovering the polypeptide from the culture.

Apart from recombinant production, polypeptides of the invention may be produced,
albeit less efficiently, by chemical synthesis or a combination of chemical synthesis and
recombinant DNA technology.

The nucleotide sequence of the invention encoding a polypeptide of the invention may
be constructed by isolating or synthesizing a nucleotide sequence encoding the parent
polypeptide and fusing a nucleotide sequence encoding the relevant peptide addition in
accordance with established technologies. To the extent amino acid modifications are to be
made in the parent polypeptide, these are conveniently done by mutagenesis, e.g. using
site-directed mutagenesis in accordance with well-known methods, e.g. as described in Nelson
and Long, Analytical Biochemistry 180, 147-151, 1989, random mutagenesis or shuffling.

The nucleotide sequence may be prepared by chemical synthesis, e.g. by using an
oligonucleotide synthesizer, wherein oligonucleotides are designed based on the amino acid
sequence of the desired polypeptide, and preferably selecting those codons that are favoured in
the host cell in which the recombinant polypeptide will be produced. For example, several
small oligonucleotides coding for portions of the desired polypeptide may be synthesized and
assembled by polymerase chain reaction (PCR), ligation or ligation chain reaction (LCR). The
individual oligonucleotides typically contain 5' or 3' overhangs for complementary assembly.

Once assembled (by synthesis, site-directed mutagenesis or another method), the
nucleotide sequence encoding the polypeptide may be inserted into a recombinant vector and
operably linked to control sequences necessary for expression thereof in the desired
transformed host cell.
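
The codon selection and oligonucleotide design described above can be sketched as follows.
This is an illustration under stated assumptions: the one-codon-per-residue table is a
placeholder and does not represent measured codon usage of any particular host, the function
names are hypothetical, and all fragments are written on one strand, whereas a practical
assembly design would normally alternate strands.

```python
# Illustrative one-codon-per-residue table; NOT measured usage data for any host.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein: str) -> str:
    """Encode a protein sequence using one chosen codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def split_into_oligos(dna: str, oligo_len: int = 40, overlap: int = 20) -> list[str]:
    """Cut a sequence into fragments whose ends overlap for assembly."""
    if not 0 < overlap < oligo_len:
        raise ValueError("overlap must be positive and shorter than the oligo")
    step = oligo_len - overlap
    oligos = []
    for start in range(0, len(dna), step):
        oligos.append(dna[start:start + oligo_len])
        if start + oligo_len >= len(dna):
            break  # the last fragment reaches the end of the sequence
    return oligos


dna = back_translate("MKTNGS")  # hypothetical six-residue peptide
for oligo in split_into_oligos(dna, oligo_len=12, overlap=6):
    print(oligo)
```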

It should of course be understood that not all vectors and expression control sequences
function equally well to express the nucleotide sequence encoding the polypeptide part of the
invention. Neither will all hosts function equally well with the same expression system.
However, one of skill in the art may make a selection among these vectors, expression control
sequences and hosts without undue experimentation. For example, in selecting a vector, the
host must be considered because the vector must replicate in it or be able to integrate into the
chromosome. The vector's copy number, the ability to control that copy number, and the
expression of any other proteins encoded by the vector, such as antibiotic markers, should also
be considered. In selecting an expression control sequence, a variety of factors should also be
considered. These include, for example, the relative strength of the sequence, its controllability,
and its compatibility with the nucleotide sequence encoding the polypeptide, particularly as
regards potential secondary structures. Hosts should be selected by consideration of their
compatibility with the chosen vector, the toxicity of the product coded for by the nucleotide
sequence, their secretion characteristics, their ability to fold the polypeptide correctly, their
fermentation or culture requirements, and the ease of purification of the products coded for by
the nucleotide sequence.

The recombinant vector may be an autonomously replicating vector, i.e. a vector
existing as an extrachromosomal entity, the replication of which is independent of
chromosomal replication, e.g. a plasmid. Alternatively, the vector is one which, when
introduced into a host cell, is integrated into the host cell genome and replicated together with
the chromosome(s) into which it has been integrated.

The vector is preferably an expression vector, in which the nucleotide sequence
encoding the polypeptide of the invention is operably linked to additional segments required
for transcription of the nucleotide sequence. The vector is typically derived from plasmid or
viral DNA. A number of suitable expression vectors for expression in the host cells mentioned
herein are commercially available or described in the literature. Useful expression vectors for
eukaryotic hosts include, for example, vectors comprising expression control sequences from
SV40, bovine papilloma virus, adenovirus and cytomegalovirus. Specific vectors are, e.g.,
pCDNA3.1(+)\Hyg (Invitrogen, Carlsbad, CA, USA) and pCI-neo (Stratagene, La Jolla, CA,
USA). Useful expression vectors for yeast cells include the 2μ plasmid and derivatives thereof,
the POT1 vector (US 4,931,373), the pJSO37 vector described in (Okkels, Ann. New York
Acad. Sci. 782, 202-207, 1996) and pPICZ A, B or C (Invitrogen, Carlsbad, CA, USA). Useful
vectors for insect cells include pVL941, pBG311 (Cate et al., "Isolation of the Bovine and
Human Genes for Mullerian Inhibiting Substance And Expression of the Human Gene In
Animal Cells", Cell, 45, pp. 685-98 (1986)), pBluebac 4.5 and pMelbac (both available from
Invitrogen, Carlsbad, CA, USA).

Other vectors for use in this invention include those that allow the nucleotide sequence
encoding the polypeptide of the invention to be amplified in copy number. Such amplifiable
vectors are well known in the art. They include, for example, vectors able to be amplified by
DHFR amplification (see, e.g., Kaufman, U.S. Pat. No. 4,470,461, Kaufman and Sharp,
"Construction Of A Modular Dihydrofolate Reductase cDNA Gene: Analysis Of Signals
Utilized For Efficient Expression", Mol. Cell. Biol., 2, pp. 1304-19 (1982)) and glutamine
synthetase ("GS") amplification (see, e.g., US 5,122,464 and EP 338,841).

The recombinant vector may further comprise a DNA sequence enabling the vector to
replicate in the host cell in question. An example of such a sequence (when the host cell is a
mammalian cell) is the SV40 origin of replication. When the host cell is a yeast cell, suitable
sequences enabling the vector to replicate are the yeast plasmid 2μ replication genes REP 1-3
and origin of replication.

The vector may also comprise a selectable marker, e.g. a gene the product of which
complements a defect in the host cell, such as the gene coding for dihydrofolate reductase
(DHFR) or the Schizosaccharomyces pombe TPI gene (described by P.R. Russell, Gene 40,
1985, pp. 125-130), or one which confers resistance to a drug, e.g. ampicillin, kanamycin,
tetracyclin, chloramphenicol, neomycin, hygromycin or methotrexate. For filamentous fungi,
selectable markers include amdS, pyrG, arcB, niaD, sC.

The term "control sequences" is defined herein to include all components, which are
necessary or advantageous for the expression of the polypeptide of the invention. Each control
sequence may be native or foreign to the nucleic acid sequence encoding the polypeptide. Such
control sequences include, but are not limited to, a leader, polyadenylation sequence,
propeptide sequence, promoter, enhancer or upstream activating sequence, signal peptide
sequence, and transcription terminator. At a minimum, the control sequences include a
promoter operably linked to the nucleotide sequence encoding the polypeptide.

"Operably linked" refers to the covalent joining of two or more nucleotide sequences,
by means of enzymatic ligation or otherwise, in a configuration relative to one another such
that the normal function of the sequences can be performed. For example, the nucleotide
sequence encoding a presequence or secretory leader is operably linked to a nucleotide
sequence for a polypeptide if it is expressed as a preprotein that participates in the secretion of
the polypeptide; a promoter or enhancer is operably linked to a coding sequence if it affects the
transcription of the sequence; a ribosome binding site is operably linked to a coding sequence
if it is positioned so as to facilitate translation. Generally, "operably linked" means that the
nucleotide sequences being linked are contiguous and, in the case of a secretory leader,
contiguous and in reading phase. Linking is accomplished by ligation at convenient restriction
sites. If such sites do not exist, then synthetic oligonucleotide adaptors or linkers are used, in
conjunction with standard recombinant DNA methods.
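
The requirement that a secretory leader be contiguous and in reading phase with the coding
sequence can be checked at the sequence level as sketched below. The function name and the
short example sequences are hypothetical; the check only covers the reading frame and does not
assess whether the leader is functional in a given host.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_reading_phase(leader: str, coding: str) -> bool:
    """True if leader + coding forms one uninterrupted open reading frame."""
    leader, coding = leader.upper(), coding.upper()
    # The leader must start with ATG and both parts must consist of whole codons.
    if not leader.startswith("ATG") or len(leader) % 3 or len(coding) % 3:
        return False
    fusion = leader + coding
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # A stop codon is tolerated only as the final codon of the fusion.
    return all(codon not in STOP_CODONS for codon in codons[:-1])


print(in_reading_phase("ATGAAA", "GCTTAA"))  # leader joined in frame
print(in_reading_phase("ATGAA", "GCTTAA"))   # leader shifts the frame
```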

A wide variety of expression control sequences may be used in the present invention.
Such useful expression control sequences include the expression control sequences associated
with structural genes of the foregoing expression vectors as well as any sequence known to
control the expression of genes of prokaryotic or eukaryotic cells or their viruses, and various
combinations thereof.

Examples of suitable control sequences for directing transcription in mammalian cells
include the early and late promoters of SV40 and adenovirus, e.g. the adenovirus 2 major late
promoter, the MT-1 (metallothionein gene) promoter, the human cytomegalovirus
immediate-early gene promoter (CMV), the human elongation factor 1α (EF-1α) promoter, the
Drosophila minimal heat shock protein 70 promoter, the Rous Sarcoma Virus (RSV) promoter,
the human ubiquitin C (UbC) promoter, the human growth hormone terminator, SV40 or
adenovirus E1b region polyadenylation signals and the Kozak consensus sequence (Kozak, M.
J Mol Biol 1987 Aug 20;196(4):947-50).
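
For orientation, the context of a start codon can be compared with the Kozak consensus as
sketched below. The sketch relies on the commonly cited rule that a purine in position -3 and a
G in position +4 (the A of the ATG being +1) are the most influential positions. The function
name, the three-level classification and the example sequences are assumptions made for
illustration and are not part of the specification.

```python
def kozak_context(sequence: str, atg_index: int) -> str:
    """Classify the context of the ATG starting at a 0-based index."""
    seq = sequence.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("flanking bases at -3 and +4 are required")
    purine_at_minus3 = seq[atg_index - 3] in "AG"  # position -3
    g_at_plus4 = seq[atg_index + 3] == "G"         # position +4
    matches = purine_at_minus3 + g_at_plus4        # booleans add up to 0, 1 or 2
    return ("weak", "adequate", "strong")[matches]


print(kozak_context("GCCACCATGGCT", 6))  # both key positions match
print(kozak_context("TTTTTTATGTTT", 6))  # neither key position matches
```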

In order to improve expression in mammalian cells a synthetic intron may be inserted in
the 5' untranslated region of the nucleotide sequence encoding the polypeptide of the
invention. An example of a synthetic intron is the synthetic intron from the plasmid pCI-Neo
(available from Promega Corporation, WI, USA).

Examples of suitable control sequences for directing transcription in insect cells include
the polyhedrin promoter, the P10 promoter, the Autographa californica polyhedrosis virus
basic protein promoter, the baculovirus immediate early gene 1 promoter and the baculovirus
39K delayed-early gene promoter, and the SV40 polyadenylation sequence.

Examples of suitable control sequences for use in yeast host cells include the promoters
of the yeast α-mating system, the yeast triose phosphate isomerase (TPI) promoter, promoters
from yeast glycolytic genes or alcohol dehydrogenase genes, the ADH2-4c promoter and the
inducible GAL promoter.




Examples of suitable control sequences for use in filamentous fungal host cells include
the ADH3 promoter and terminator, a promoter derived from the genes encoding Aspergillus
oryzae TAKA amylase, triose phosphate isomerase or alkaline protease, an A. niger α-amylase,
A. niger or A. nidulans glucoamylase, A. nidulans acetamidase, Rhizomucor miehei aspartic
proteinase or lipase, the TPI1 terminator and the ADH3 terminator.

The nucleotide sequence of the invention may or may not also include a nucleotide
sequence that encodes a signal peptide. The signal peptide is present when the polypeptide is to
be secreted from the cells in which it is expressed. Such signal peptide, if present, should be
one recognized by the cell chosen for expression of the polypeptide. The signal peptide may be
homologous (e.g. be that normally associated with the parent polypeptide in question) or
heterologous (i.e. originating from another source than the parent polypeptide) to the
polypeptide or may be homologous or heterologous to the host cell, i.e. be a signal peptide
normally expressed from the host cell or one which is not normally expressed from the host
cell. Accordingly, the signal peptide may be prokaryotic, e.g. derived from a bacterium, or
eukaryotic, e.g. derived from a mammalian, insect, filamentous fungal or yeast cell.

The presence or absence of a signal peptide will, e.g., depend on the expression host
cell used for the production of the polypeptide, the protein to be expressed (whether it is an
intracellular or extracellular protein) and whether it is desirable to obtain secretion. For use in
filamentous fungi, the signal peptide may conveniently be derived from a gene encoding an
Aspergillus sp. amylase or glucoamylase, a gene encoding a Rhizomucor miehei lipase or
protease or a Humicola lanuginosa lipase. The signal peptide is preferably derived from a gene
encoding A. oryzae TAKA amylase, A. niger neutral α-amylase, A. niger acid-stable amylase,
or A. niger glucoamylase. For use in insect cells, the signal peptide may conveniently be
derived from an insect gene (cf. WO 90/05783), such as the lepidopteran Manduca sexta
adipokinetic hormone precursor (cf. US 5,023,328), the honeybee melittin (Invitrogen,
Carlsbad, CA, USA), ecdysteroid UDPglucosyltransferase (egt) (Murphy et al., Protein
Expression and Purification 4, 349-357 (1993)) or human pancreatic lipase (hpl) (Methods in
Enzymology 284, pp. 262-272, 1997).

Specific examples of signal peptides for use in mammalian cells include that of human
glucocerebrosidase apparent from the examples hereinafter or the murine Ig kappa light chain
signal peptide (Coloma, M (1992) J. Imm. Methods 152:89-104). For use in yeast cells suitable
signal peptides have been found to be the α-factor signal peptide from S. cerevisiae (cf. US
4,870,008), the signal peptide of mouse salivary amylase (cf. O. Hagenbuchle et al.,
Nature 289, 1981, pp. 643-646), a modified carboxypeptidase signal peptide (cf. L.A.
Valls et al., Cell 48, 1987, pp. 887-897), the yeast BAR1 signal peptide (cf. WO 87/02670),
and the yeast aspartic protease 3 (YAP3) signal peptide (cf. M. Egel-Mitani et al., Yeast 6,
1990, pp. 127-137).

Any suitable host may be used to produce the polypeptide of the invention, including
bacteria, fungi (including yeasts), plant, insect, mammal, or other appropriate animal cells or
cell lines, as well as transgenic animals or plants. When a non-glycosylating organism such as
E. coli is used, and the polypeptide is to be a glycosylated polypeptide, the expression in E. coli
is preferably followed by suitable in vitro glycosylation.

Examples of bacterial host cells include gram-positive bacteria such as strains of
Bacillus, e.g. B. brevis or B. subtilis, or Streptomyces, or gram-negative bacteria,
such as strains of E. coli or Pseudomonas. The introduction of a vector into a bacterial host
cell may, for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen, 1979,
Molecular General Genetics 168: 111-115), using competent cells (see, e.g., Young and
Spizizen, 1961, Journal of Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971,
Journal of Molecular Biology 56: 209-221), electroporation (see, e.g., Shigekawa and Dower,
1988, Biotechniques 6: 742-751), or conjugation (see, e.g., Koehler and Thorne, 1987, Journal
of Bacteriology 169: 5271-5278).

Examples of suitable filamentous fungal host cells include strains of Aspergillus, e.g. A.
oryzae, A. niger, or A. nidulans, Fusarium or Trichoderma. Fungal cells may be transformed
by a process involving protoplast formation, transformation of the protoplasts, and regeneration
of the cell wall in a manner known per se. Suitable procedures for transformation of
Aspergillus host cells are described in EP 238 023 and US 5,679,543. Suitable methods for
transforming Fusarium species are described by Malardier et al., 1989, Gene 78: 147-156 and
WO 96/00787. Yeast may be transformed using the procedures described by Becker and
Guarente, In Abelson, J.N. and Simon, M.I., editors, Guide to Yeast Genetics and Molecular
Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York;
Ito et al., 1983, Journal of Bacteriology 153: 163; and Hinnen et al., 1978, Proceedings of the
National Academy of Sciences USA 75: 1920.

When the polypeptide of the invention is to be in vivo glycosylated, the host cell is
selected from a group of host cells capable of generating the desired glycosylation of the
polypeptide. Thus, the host cell may advantageously be selected from a yeast cell, insect cell,
or mammalian cell.

Examples of suitable yeast host cells include strains of Saccharomyces, e.g. S. cerevisiae,
Schizosaccharomyces, Kluyveromyces, Pichia, such as P. pastoris or P. methanolica,
Hansenula, such as H. polymorpha, or Yarrowia. Of particular interest are yeast glycosylation
mutant cells, e.g. derived from S. cerevisiae, P. pastoris or Hansenula spp. (e.g. the S.
cerevisiae glycosylation mutants och1, och1 mnn1 or och1 mnn1 alg3 described by Nagasu et
al., Yeast 8, 535-547, 1992 and Nakanishi-Shindo et al., J. Biol. Chem. 268, 26338-26345,
1993). Methods for transforming yeast cells with heterologous DNA and producing
heterologous polypeptides therefrom are disclosed by Clontech Laboratories, Inc, Palo Alto,
CA, USA (in the product protocol for the Yeastmaker™ Yeast Transformation System Kit), and
by Reeves et al., FEMS Microbiology Letters 99 (1992) 193-198, Manivasakam and Schiestl,
Nucleic Acids Research, 1993, Vol. 21, No. 18, pp. 4414-4415 and Ganeva et al., FEMS
Microbiology Letters 121 (1994) 159-164.

Examples of suitable insect host cells include a Lepidoptera cell line, such as
Spodoptera frugiperda (Sf9 or Sf21) or Trichoplusia ni cells (High Five) (US 5,077,214).
Transformation of insect cells and production of heterologous polypeptides therein may be
performed as described by Invitrogen, Carlsbad, CA, USA.

Examples of suitable mammalian host cells include Chinese hamster ovary (CHO) cell lines
(e.g. CHO-K1; ATCC CCL-61), Green Monkey cell lines (COS) (e.g. COS 1 (ATCC CRL-1650),
COS 7 (ATCC CRL-1651)), mouse cells (e.g. NS/0), Baby Hamster Kidney (BHK) cell
lines (e.g. ATCC CRL-1632 or ATCC CCL-10), and human cells (e.g. HEK 293 (ATCC
CRL-1573)), as well as plant cells in tissue culture. Additional suitable cell lines are known in
the art and available from public depositories such as the American Type Culture Collection,
Rockville, Maryland. Of interest for the present purpose is a mammalian glycosylation mutant
cell line, such as CHO-LEC1, CHO-LEC2 or CHO-LEC18 (CHO-LEC1: Stanley et al., Proc.
Natl. Acad. Sci. USA 72, 3323-3327, 1975 and Grossmann et al., J. Biol. Chem. 270, 29378-29385,
1995; CHO-LEC18: Raju et al., J. Biol. Chem. 270, 30294-30302, 1995).

Methods for introducing exogenous DNA into mammalian host cells include calcium
phosphate-mediated transfection, electroporation, DEAE-dextran mediated transfection,
liposome-mediated transfection, viral vectors and the transfection method described by Life
Technologies Ltd, Paisley, UK using Lipofectamine 2000. These methods are well known in the
art and are described, e.g., by Ausubel et al. (eds.), 1996, Current Protocols in Molecular Biology,
John Wiley & Sons, New York, USA. The cultivation of mammalian cells is conducted
according to established methods, e.g. as disclosed in Animal Cell Biotechnology, Methods
and Protocols, edited by Nigel Jenkins, 1999, Humana Press Inc, Totowa, New Jersey, USA
and Harrison MA and Rae IF, General Techniques of Cell Culture, Cambridge University Press
1997.




In the production methods of the present invention, cells are cultivated in a nutrient
medium suitable for production of the polypeptide using methods known in the art. For
example, cells are cultivated by shake flask cultivation, small-scale or large-scale fermentation
(including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial
fermenters performed in a suitable medium and under conditions allowing the polypeptide to
be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium
comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art.
Suitable media are available from commercial suppliers or may be prepared according to
published compositions (e.g., in catalogues of the American Type Culture Collection). If the
polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly
from the medium. If the polypeptide is not secreted, it can be recovered from cell lysates.

The resulting polypeptide may be recovered by methods known in the art. For
example, the polypeptide may be recovered from the nutrient medium by conventional
procedures including, but not limited to, centrifugation, filtration, extraction, spray drying,
evaporation, or precipitation.

The polypeptides may be purified by a variety of procedures known in the art including,
but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic,
chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric
focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or
extraction (see, e.g., Protein Purification, J-C Janson and Lars Ryden, editors, VCH
Publishers, New York, 1989).

Other methods of the invention 

In accordance with a specific aspect, a nucleotide sequence encoding the polypeptide of the
invention is prepared by a method comprising

a) subjecting a nucleotide sequence encoding the polypeptide Pp to elongation mutagenesis,

b) expressing the mutated nucleotide sequence obtained in step a) in a suitable host cell,
optionally

c) conjugating polypeptides expressed in step b) to a second non-peptide moiety,

d) selecting polypeptides of step b) or c) which comprise at least one oligosaccharide moiety
and optionally a second non-peptide moiety attached to the peptide addition part of the
polypeptide, and

e) isolating a nucleotide sequence encoding the polypeptide selected in step d).

In the present context the term "elongation mutagenesis" is intended to indicate any
manner in which the nucleotide sequence encoding the parent polypeptide Pp can be extended
to further encode the peptide addition. For instance, a nucleotide sequence encoding a peptide
addition of a suitable length may be synthesized and fused to a nucleotide sequence encoding
the polypeptide Pp. The resulting fused nucleotide sequence may then be subjected to further
modification by any suitable method, e.g. one which involves gene shuffling, other
recombination between nucleotide sequences, random mutagenesis, random elongation
mutagenesis or any combination of these methods. Such methods are further described in the
Methods section herein.

The expression and optional conjugation steps are conducted as described in further
detail elsewhere in the present application, and the selection step d) is conducted using any
suitable method available in the art.

In one embodiment the above method further comprises screening polypeptides
resulting from step b) or c) for at least one improved property, in particular any of those
improved properties listed herein, prior to the selection step, and wherein the selection step d)
further comprises selecting polypeptides having such improved property.

Furthermore, in the above method the elongation mutagenesis can be conducted so as to
enrich for codons encoding a glycosylation site and/or an amino acid residue comprising an
attachment group for a second non-peptide moiety, in particular an in vivo glycosylation site.

Still further, the above method can comprise subjecting the part of the nucleotide
sequence encoding the polypeptide Pp of interest to mutagenesis to remove and/or introduce
glycosylation site(s) and/or amino acid residue(s) comprising an attachment group for the
second non-peptide moiety. The nucleotide sequence may be subjected to any type of
mutagenesis, e.g. any of those described herein. The mutagenesis of the nucleotide sequence
encoding the polypeptide Pp of interest can be conducted prior to assembling the sequence
with that encoding the peptide addition, concomitantly with or after any mutagenesis of the
peptide addition part of the assembled nucleotide sequence.

In a further aspect, the invention relates to a method of producing a glycosylated
polypeptide encoded by a nucleotide sequence of the invention prepared by the above method,
wherein the nucleotide sequence encoding the polypeptide selected in step d) is expressed in a
glycosylating host cell and the resulting glycosylated expressed polypeptide is recovered.

In a still further aspect the invention relates to a method of improving one or more
selected properties of a polypeptide Pp of interest, which method comprises

a) preparing a nucleotide sequence encoding a polypeptide comprising or consisting
essentially of the primary structure

NH2-X-Pp-COOH,

wherein
X is a peptide addition comprising or contributing to a glycosylation site and/or an attachment
group for a second non-peptide moiety that is capable of conferring the selected improved
property/ies to the polypeptide Pp,

b) expressing the nucleotide sequence of a) in a suitable host cell, optionally

c) conjugating the expressed polypeptide of b) to a second non-peptide moiety, and

d) recovering the polypeptide resulting from step b) or c).

For instance, the polypeptide is any of those described herein. For instance, the
nucleotide sequence of step a) is prepared by subjecting a nucleotide sequence encoding the
polypeptide Pp to elongation mutagenesis, e.g. to enrich for codons encoding an amino acid
residue comprising or contributing to a glycosylation site and/or an attachment group for a
second non-peptide moiety, in particular an in vivo glycosylation site. Also, in the preparation
of the nucleotide sequence of a), the part of the nucleotide sequence encoding the polypeptide
Pp can be subjected to mutagenesis to remove and/or introduce glycosylation site(s) and/or
attachment group(s) for a second non-peptide moiety.

The method according to this aspect can further comprise a screening step (after step
c)), wherein the polypeptide resulting from step b) or c) is screened for one or more improved
properties, in particular any of those improved properties which are described hereinabove.
Usually, when a polypeptide has been selected in a screening step of a method of the
invention the nucleotide sequence encoding the polypeptide is isolated and used for expression
of larger amounts of the polypeptide. The amino acid sequence of the resulting polypeptide is
determined and the polypeptide may be subjected to conjugation in a larger scale.
Subsequently, the polypeptide is assayed with respect to the property to be improved.

Uses of a polypeptide of the invention

It will be understood that polypeptides of the invention can be used for a variety of purposes,
depending on the type and nature of polypeptide. For instance, it is contemplated that a
polypeptide of the invention prepared from a therapeutic polypeptide is useful for the same
therapeutic purposes as the parent polypeptide, i.e. for the treatment of a particular disease.
Accordingly, the polypeptide of the invention may be formulated into a pharmaceutical
composition. Also, when the polypeptide of the invention is an in vivo glycosylated
polypeptide which does not comprise any other type of non-peptide moiety, a nucleotide
sequence encoding the polypeptide can be used in gene therapy in accordance with established
principles. When the polypeptide Pp is an antigen the polypeptide of the invention may be
provided in the form of a vaccine.




Nucleotide sequence modification methods 

For example, a peptide addition may be constructed from two or more nucleotide sequences
encoding a polypeptide of interest with a peptide addition, the sequences being sufficiently
homologous to allow recombination between the sequences, in particular in the part thereof
encoding the peptide addition. The combination of nucleotide sequences or sequence parts is
conveniently conducted by methods known in the art, for instance methods which involve
homologous cross-over such as disclosed in US 5,093,257, or methods which involve gene
shuffling, i.e., recombination between two or more homologous nucleotide sequences resulting
in new nucleotide sequences having a number of nucleotide alterations when compared to the
starting nucleotide sequences. In order for homology based nucleic acid shuffling to take place
the relevant parts of the nucleotide sequences are preferably at least 50% identical, such as at
least 60% identical, more preferably at least 70% identical, such as at least 80% identical. The
recombination can be performed in vitro or in vivo. Examples of suitable in vitro gene
shuffling methods are disclosed by Stemmer et al. (1994), Proc. Natl. Acad. Sci. USA, vol. 91,
pp. 10747-10751; Stemmer (1994), Nature, vol. 370, pp. 389-391; Smith (1994), Nature, vol.
370, pp. 324-325; Zhao et al., Nat. Biotechnol. 1998, Mar; 16(3): 258-61; Zhao H. and Arnold,
FB, Nucleic Acids Research, 1997, Vol. 25, No. 6, pp. 1307-1308; Shao et al., Nucleic Acids
Research 1998, Jan 15; 26(2): pp. 681-83; and WO 95/17413. An example of a suitable in vivo
shuffling method is disclosed in WO 97/07205.
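
The identity preference stated above can be checked with a few lines of code. The following is a minimal sketch, assuming the two nucleotide sequences have already been aligned to equal length and that gapped columns count as mismatches; the text does not define how identity is computed at this point, and the function names `percent_identity` and `shuffling_tier` are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length nucleotide sequences.

    Assumption: columns where either sequence has a gap ('-') count as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def shuffling_tier(identity: float) -> str:
    """Report the highest preference tier named in the text that is met."""
    for threshold in (80, 70, 60, 50):
        if identity >= threshold:
            return f"at least {threshold}% identical"
    return "below the 50% preference"


if __name__ == "__main__":
    # Illustrative sequences only; they are not taken from the document.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, shuffling_tier(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only covers the comparison step.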

Furthermore, a peptide addition can be constructed by preparing a randomly
mutagenized library, conveniently prepared by subjecting a nucleotide sequence encoding the
polypeptide of the invention or the peptide addition to random mutagenesis to create a large
number of mutated nucleotide sequences. While the random mutagenesis can be entirely
random, both with respect to where in the nucleotide sequence the mutagenesis occurs and with
respect to the nature of mutagenesis, it is preferably conducted so as to randomly mutate only
the part of the sequence that encodes the peptide addition. Also, the random mutagenesis can be
directed towards introducing certain types of amino acid residues, in particular amino acid
residues containing an attachment group, at random into the polypeptide molecule or at random
into the peptide addition part thereof. Besides substitutions, random mutagenesis can also cover
random introduction of insertions or deletions. Preferably, the insertions are made in reading
frame, e.g., by performing multiple introduction of three nucleotides as described by Hallet et
al., Nucleic Acids Res. 1997, 25(9): 1866-7 and Sondek and Shortle, Proc. Natl. Acad. Sci. USA
1992, 89(8):3581-5.

The random mutagenesis (either of the whole nucleotide sequence or more preferably
the part thereof encoding the peptide addition) can be performed by any suitable method. For
example, the random mutagenesis is performed using a suitable physical or chemical
mutagenizing agent, a suitable oligonucleotide, PCR generated mutagenesis or any
combination of these mutagenizing agents and/or other methods according to state of the art
technology, e.g. as disclosed in WO 97/07202.

Error-prone PCR generated mutagenesis, e.g. as described by J.O. Deshler (1992),
GATA 9(4): 103-106 and Leung et al., Technique (1989) Vol. 1, No. 1, pp. 11-15, is
particularly useful for mutagenesis of longer peptide stretches (corresponding to nucleotide
sequences containing more than 100 bp) or entire genes, and is preferably performed under
conditions that increase the misincorporation of nucleotides.

Random mutagenesis based on doped or spiked oligonucleotides, or by specific
sequence oligonucleotides, is of particular use for mutagenesis of the part of the nucleotide
sequence encoding the peptide addition.

Random mutagenesis of the part of the nucleotide sequence encoding the peptide
addition can be performed using PCR generated mutagenesis, in which one or more suitable
oligonucleotide primers flanking the area to be mutagenized are used. In addition, doping or
spiking with oligonucleotides can be used to introduce mutations so as to remove or introduce
attachment groups for the relevant non-peptide moiety. State of the art knowledge and
computer programs (e.g. as described by Siderovski DP and Mak TW, Comput. Biol. Med.
(1993) Vol. 23, No. 6, pp. 463-474 and Jensen et al., Nucleic Acids Research, 1998, Vol. 26,
No. 3) can be used for calculating the most optimal nucleotide mixture for a given amino acid
preference. The oligonucleotides can be incorporated into the nucleotide sequence encoding the
peptide addition by any published technique using e.g. PCR, LCR or any DNA polymerase or
ligase.
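
To illustrate what "calculating a nucleotide mixture for a given amino acid preference" involves, the sketch below computes the amino acid distribution encoded by a single doped codon. It is not the algorithm of the cited computer programs; it assumes independent nucleotide incorporation at each codon position and the standard genetic code, and the name `residue_distribution` and the example mixture are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def residue_distribution(doped_codon):
    """Return {amino acid: probability} for one doped codon.

    doped_codon is a list of three dicts, one per codon position, each mapping
    a base to its fraction in the synthesis mixture (fractions sum to 1).
    Assumption: bases are incorporated independently at each position.
    """
    dist = {}
    for codon in product(BASES, repeat=3):
        p = 1.0
        for position, base in enumerate(codon):
            p *= doped_codon[position].get(base, 0.0)
        if p > 0.0:
            aa = CODON_TABLE["".join(codon)]
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


if __name__ == "__main__":
    # Hypothetical doping: an AAC (Asn) codon whose second base is spiked with 10% C.
    mixture = [{"A": 1.0}, {"A": 0.9, "C": 0.1}, {"C": 1.0}]
    print(residue_distribution(mixture))
```

A search over candidate mixtures could then score each distribution against the desired amino acid preference; that optimisation step is left out here.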

According to a convenient PCR method the nucleotide sequence encoding the
polypeptide of the invention and in particular the peptide addition thereof is used as a template
and, e.g., doped or specific oligonucleotides are used as primers. In addition, cloning primers
localized outside the targeted region can be used. The resulting PCR product can either
directly be cloned into an appropriate expression vector or gel purified and amplified in a
second PCR reaction using the cloning primers and cloned into an appropriate expression
vector.

In addition to the random mutagenesis methods described herein, it is occasionally
useful to employ site specific mutagenesis techniques to modify one or more selected amino
acids in the peptide addition, in particular to optimise the peptide addition with respect to the
number of attachment groups.

Furthermore, random elongation mutagenesis as described by Matsuura et al., op cit, can
be used to construct a nucleotide sequence encoding a polypeptide having a C-terminal peptide
addition. A nucleotide sequence encoding the polypeptide of the invention
having an N-terminal peptide addition can be constructed in an analogous way.

Also, the methods disclosed in WO 97/04079, the contents of which are incorporated
herein by reference, can be used for constructing a nucleotide sequence encoding the
polypeptide of the invention.

The nucleotide sequence(s) or nucleotide sequence region(s) to be mutagenized is
typically present on a suitable vector such as a plasmid or a bacteriophage, which as such is
incubated with or otherwise exposed to the mutagenizing agent. The nucleotide sequence(s) to
be mutagenized can also be present in a host cell either by being integrated into the genome of
said cell or by being present on a vector harboured in the cell. Alternatively, the nucleotide
sequence to be mutagenized is in isolated form. The nucleotide sequence is preferably a DNA
sequence such as a cDNA, genomic DNA or synthetic DNA sequence.

Subsequent to the incubation with or exposure to the mutagenizing agent, the mutated
nucleotide sequence, normally in amplified form, is expressed by culturing a suitable host cell
carrying the nucleotide sequence under conditions allowing expression to take place. The host
cell used for this purpose is one which has been transformed with the mutated nucleotide
sequence(s), optionally present on a vector, or one which carried the nucleotide sequence
during the mutagenesis, or any kind of gene library.




Design of peptide addition 

One example of a useful guide for designing an N-terminal peptide addition containing
N-glycosylation sites is characterized by the following formula:

X1(NX2[T/S])X3(NX2[T/S])nX4-Pp

wherein each of X1, X3 and X4 independently is absent or 1, 2, 3 or 4 amino acid residues of
any type, X2 is a single amino acid residue of any type except proline, n is any integer between 0
and 6, [T/S] is a threonine or serine residue, preferably a threonine residue, and N and Pp have the
meaning defined elsewhere herein. It has been found that sometimes the nature of the amino
acid residue occupying position -1 to -4 relative to the N-residue of an N-glycosylation site
may be important for the degree to which said N-glycosylation site is used. Accordingly, X1,
X3, and X4 may be chosen so as to obtain an increased utilization of the relevant site (as
determined by a trial and error type of experiment). In a first step about 10 different muteins
are made that have the above formula. For instance, the about 10 muteins are designed on the
basis that each of X1, X3 and X4 independently is 1 or 2 alanine residues or is absent, n is any
integer between 0 and 5, [T/S] is threonine, and X2 is alanine. Based on, e.g., in vitro bioactivity and
half-life results obtained with these muteins (or any other relevant property), optimal
number(s) of amino acids and glycosylation(s) can be determined and new muteins can be
constructed based on this information. The process is repeated until an optimal glycosylated
polypeptide is obtained.
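
The design space described by the formula can be enumerated programmatically. The sketch below is illustrative only: it assumes alanine spacers (absent, A or AA), X2 = alanine, [T/S] = threonine and n from 0 to 5, as in the first-step example, and it detects N-glycosylation sites with the sequon pattern N, any residue except proline, then S or T. The names `count_sequons` and `peptide_additions` are hypothetical.

```python
import re
from itertools import product

# N-glycosylation sequon: N, any residue except P, then S or T.
# A lookahead is used so that overlapping sites are counted as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def count_sequons(peptide: str) -> int:
    """Count potential N-glycosylation sites in a one-letter peptide string."""
    return len(SEQUON.findall(peptide))


def peptide_additions(spacers=("", "A", "AA"), x2="A", hydroxy="T", max_n=5):
    """Yield X1(N X2 [T/S])X3(N X2 [T/S])n X4 for every spacer choice and n."""
    site = "N" + x2 + hydroxy
    for x1, x3, x4 in product(spacers, repeat=3):
        for n in range(max_n + 1):
            yield x1 + site + x3 + site * n + x4


if __name__ == "__main__":
    candidates = sorted(set(peptide_additions()), key=lambda p: (len(p), p))
    for peptide in candidates[:10]:
        print(peptide, count_sequons(peptide))
```

The roughly ten first-round muteins would be picked from such a list; the selection criterion is an experimental choice and is not encoded here.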

Alternatively, random mutagenesis may be used for creating N-terminally extended
polypeptides. For instance, a random mutagenized library is made on the basis of the above
formula. Doped oligonucleotides are synthesized coding for one amino acid residue in position
X2 (the amino acid residue being different from proline), in which each of X1, X3, and X4
independently is 0, 1 or 2 amino acid residues of any type, n is 2 and [T/S] is threonine, and are
used for constructing the random mutagenized library.

One example of a useful guide for designing an N-terminal peptide addition containing
a PEGylation attachment group is characterized by the following formula, using a lysine residue
as an example of a PEGylation site. It will be understood that peptide additions with other
attachment groups can be designed in an analogous way.

Y1(K)Y2(K)nY3-Pp

wherein each of Y1, Y2 and Y3 independently is 0, 1, 2, 3 or 4 amino acid residues of any type
except lysine, n is an integer between 0 and 6, K is lysine, and Pp is as defined elsewhere herein.

In a first step about 10 different muteins are made that have the above formula. For
instance, the about 10 muteins are designed on the basis that each of Y1, Y2 and Y3
independently is 1 or 2 alanine residues or is absent, and n is any integer between 0 and 5. The
muteins are then PEGylated with 10 kDa PEG (e.g. using mPEG-SPA). Based on, e.g., in vitro
bioactivity and half-life results obtained with these muteins (or any other relevant property),
optimal number(s) of amino acids and PEGylation sites can be determined and new muteins
can be constructed based on this information. The process is repeated until an optimal
PEGylated polypeptide is obtained.

Alternatively, random mutagenesis may be performed by making a random
mutagenized library based on the above formula. Doped oligonucleotides are synthesized
in which each of Y1, Y2 and Y3 independently is 0, 1 or 2
amino acid residues of any type and n is 2, and are used for constructing the random mutagenized
library.

Glucocerebrosidase (GCB) Activity Assay using PNP-glucopyranoside substrate
The enzymatic activity of recombinant GCB is measured using p-nitrophenyl-β-D-glucopyranoside
(PNP-Glu) as a substrate. Hydrolysis of the PNP-Glu substrate generates
p-nitrophenol, which can be quantified by measuring absorption at 405 nm using a
spectrophotometer, as previously described (Friedmann et al., 1999, Blood 93: 2807-2816).

The assay is carried out under conditions which partially inhibit non-GCB glucosidase
activities, such conditions being achieved by using a phosphate/citrate buffer pH=5.5, 0.25 %
Triton X-100 and 0.25 % taurocholate.

The assay is run in a final volume of 200 µl, containing GCB Activity Assay Buffer and
4 mM PNP-Glu. The enzymatic hydrolysis is initiated by adding GCB and the reaction is
allowed to proceed for 1 hour at 37°C before being stopped by adding 50 µl 1 M NaOH and
measuring absorption at 405 nm. A reference standard curve of p-nitrophenol, assayed in
parallel, is used to quantify concentrations of GCB in samples to be tested.
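
The standard-curve step can be sketched as a straight-line fit of absorbance against known amounts of the chromophore, followed by interpolation of the sample reading. This is a minimal illustration assuming a linear response over the measured range; all numerical values are invented for the example, and the names `fit_line` and `product_released` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def product_released(a405, slope, intercept):
    """Convert a sample A405 reading to an amount via the standard curve."""
    return (a405 - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standard curve: nmol chromophore per well versus A405.
    standards_nmol = [0.0, 10.0, 20.0, 40.0]
    standards_a405 = [0.05, 0.25, 0.45, 0.85]
    slope, intercept = fit_line(standards_nmol, standards_a405)

    sample_nmol = product_released(0.65, slope, intercept)
    rate_nmol_per_min = sample_nmol / 60.0  # 1 hour incubation, as in the assay
    print(round(sample_nmol, 2), round(rate_nmol_per_min, 3))
```

Readings outside the range of the standards should be diluted and re-assayed rather than extrapolated.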

In vitro uptake and stability of GCB polypeptide in macrophages
The murine monocyte/macrophage cell line, J774E (Mukhopadhyay and Stahl, Arch
Biochem Biophys 1995 Dec 1;324(1):78-84 and Diment et al., J Leukoc Biol 1987
Nov;42(5):485-90) is used to study the uptake and stability of GCB polypeptides. Cells are
grown in alpha-MEM (supplemented with 10 % fetal calf serum, 1X Pen/Strep, and 60 µM
6-thioguanine), seeded (200,000 cells per well) in the above-mentioned media containing 10
conduritol B epoxide, CBE (an irreversible GCB inhibitor) and incubated for 24 hr at 37°C.

Before starting the uptake assay, cells are washed in 0.5 ml HBSS (Hanks' balanced salt
solution). The uptake is done in a 200 µl volume, containing the appropriate concentration of
GCB polypeptide (a dose-response curve is made with GCB concentrations in the range of 25-
400 mU/ml). As a control, yeast mannan (final concentration 1.4 mg/ml) is added to inhibit the
uptake through the macrophage mannose receptor. The cells are incubated for 1 hr at 37°C and
washed three times with 0.5 ml cold HBSS.

To measure the amount of GCB taken up by the J774E cells, cells are lysed in 200 µl
GCB Activity Assay Buffer with 4 mM PNP-Glu and incubated for 1 hr at 37°C. Then, the
hydrolysis is stopped by addition of 50 µl 1 M NaOH and OD405 is measured. The data are
analysed by non-linear regression using GraphPad Prism 2.0 (GraphPad Software, San Diego,
CA).

To study the stability of GCB polypeptides in J774E cells, CBE treated cells are
incubated with 400 mU/ml GCB for 1 hr at 37°C. Then, cells are washed 3 times in HBSS to
remove extracellular GCB and incubated in HBSS. A time-course study is done by lysing the
cells after 30 min, 1 hr, 2 hr, 3 hr, 4 hr, and 5 hr in 200 µl GCB Activity Assay Buffer with
4 mM PNP-Glu and incubating the samples for 1 hr at 37°C before stopping the hydrolysis with
50 µl 1 M NaOH and measuring OD405. The data are analysed by non-linear regression using
GraphPad Prism 2.0 (GraphPad Software, San Diego, CA).
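
A time course of this kind is often summarised as an intracellular half-life. The sketch below is a simplified stand-in for the non-linear regression named above: it fits a straight line to the logarithm of the measured activity, which assumes single-exponential decay (an assumption not stated in the text). The data are synthetic and the function names are hypothetical.

```python
import math


def fit_exponential_decay(times_hr, activities):
    """Fit activity = a0 * exp(-k * t) by least squares on ln(activity).

    Returns (a0, k). Assumes strictly positive activities and
    single-exponential decay.
    """
    logs = [math.log(a) for a in activities]
    n = len(times_hr)
    mean_t = sum(times_hr) / n
    mean_l = sum(logs) / n
    stt = sum((t - mean_t) ** 2 for t in times_hr)
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(times_hr, logs))
    slope = stl / stt
    return math.exp(mean_l - slope * mean_t), -slope


def half_life(k):
    """Half-life in the time unit of k."""
    return math.log(2) / k


if __name__ == "__main__":
    # Sampling times from the protocol; activities are synthetic (k = 0.3 per hr).
    times = [0.5, 1, 2, 3, 4, 5]
    activities = [100 * math.exp(-0.3 * t) for t in times]
    a0, k = fit_exponential_decay(times, activities)
    print(round(a0, 1), round(k, 3), round(half_life(k), 2))
```

With noisy data a true non-linear fit weights the points differently from the log-linear fit, so the two approaches can give somewhat different estimates.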

Site-directed mutagenesis 

Constructions of site-directed mutations were performed using PCR with oligonucleotides
containing the desired amino acid exchanges or additions (e.g. to introduce glycosylation sites).
The resulting PCR fragment was cloned into the GCB expression vector using appropriate
restriction enzymes and subsequently DNA sequenced in order to confirm that the construct
contained the desired exchanges.

MATERIALS

GCB Activity Assay Buffer




120 mM phosphate/citrate buffer, pH=5.5, 1 mM EDTA, pH=8.0, 0.25 % Triton X-100, 0.25
% taurocholate, 4 mM β-mercaptoethanol

pGC-12 vector 

pVL1392 (Pharmingen, USA) with GCB wt cDNA sequence (SEQ ID NO 2) inserted between
EcoRV and XbaI.

Table 1 

Sequence of primers used for cloning the wt GCB coding region and inserting signal peptides
into the pGCBmat plasmid as described in Example 1.

SO49 (WT-sp-BglII): 5'-CGCAGATCTGATGGCTGGCAGCCTCACAGGATTGC-3'

SO50 (WT-stop-EcoRI): 5'-CCGGAATTCXX:ATCACTGGCGACGCCACAGGTAGGTG-3'

SO51 (WT-mature-SacI): 5'-ACBa3^AGCTOGCC(XTGCATCCCTAAAAGm

SO52 (SPegt-NheI/SacI-as):
5'-GCGTTGACGGCAGTCAGAGTTGACAGAAGGGCCAGCCAGCAAAGGATAGTCATG-3'

SO53 (SPegt-NheI/SacI-s):
5'-CTAGCATGACTATCCTTTGCTGGCTGGCCCTTCTGTCAACTCTGACTGCCGTCAACGCAGCT-3'

SO54 (SPegt-NheI/SacI-as):
5'-CCTGCTACTGCTCCCAGCAGCAGTGAAAGAGTCCAAAGTGGCAGCATG-3'

SO55 (SPegt-NheI/SacI-s):
5'-CTAGCATGCTGCCACTTTGGACTCTTTCACTGCTGC^^-3'

Cerezyme was kindly provided by Dr. E. Beutler, Scripps Institute, CA, USA.
J774E was kindly provided by G. Grabowski, Cincinnati, Ohio, US.

EXAMPLE 1




PRODUCTION OF WT GCB 

A human fibroblast cDNA library was obtained from Clontech (Human fibroblast skin cDNA
cloned in lambda-gt11, cat# HL1052b). Lambda DNA was prepared from the library by
standard methods and used as a template in a PCR reaction with either SO49 and SO50 as
primers (amplifies the GCB coding region with the human signal peptide from the second ATG)
or SO50 and SO51 as primers (amplifies the mature part of the GCB coding region) (see Table
1 in the Materials section).

The PCR products were reamplified with the same primers and agarose gel purified.
Subsequently the SO49/50 PCR product was digested with BglII and EcoRI and cloned into
the pBlueBac 4.5 vector (Invitrogen, Carlsbad, CA, USA)
digested with BamHI and EcoRI. Sequencing confirmed that the insert is identical to the
wtGCB sequence as given in SEQ ID NO 2. The resulting plasmid was used for infection of
insect cells with the GCB being partly secreted from the cells due to the human signal sequence
as described in Martin et al., DNA 7, pp. 99-106, 1988. The SO50/51 PCR product was
digested with SacI and EcoRI and cloned into the pBlueBac 4.5 vector (Invitrogen,
Carlsbad, CA, USA) digested with the same enzymes resulting in the pGCBmat plasmid. Two
different signal sequences were inserted upstream of the mature GCB codons in order to
increase the secreted amount of enzyme. The baculovirus ecdysteroid UDPglucosyltransferase
(egt) signal sequence (Murphy et al., Protein Expression and Purification 4, 349-357, 1993)
was inserted by annealing SO52 and SO53 (Table 1) and the human pancreatic lipase signal
sequence (Lowe et al., J. Biol. Chem. 264, 20042, 1989) was inserted by annealing SO54 and
SO55 (Table 1) and cloning them into the NheI and SacI digested pGCBmat plasmid. Infection
of Spodoptera frugiperda (Sf9) cells with the resulting plasmid was done according to the
protocols from Invitrogen, Carlsbad, CA, USA.

Purification of GCB polypeptides produced in insect cells 

Polypeptides with GCB activity were purified as described in US 5,236,838, with some
modifications. Cells were removed from the culture medium by centrifugation (10 min at 4000
rpm in a Sorvall RC5C centrifuge) and the supernatant was microfiltrated using a 0.22 µm
filter prior to purification. DTT was added to 1 mM and the culture supernatant was
ultrafiltrated to approximately 1/10 of the starting volume using a Vivaflow 200 system
(Vivascience). The concentrated media was centrifuged to remove possible aggregates before
application on a Toyopearl Butyl 650C resin (TosoHaas) previously equilibrated in 50 mM
sodium citrate, 20% (v/v) ethylene glycol, 1 mM DTT, pH 5.0. This chromatographic step was
performed at room temperature. The resin was washed with at least 3 column volumes of 50 mM
sodium citrate, 20% (v/v) ethylene glycol, 1 mM DTT, pH 5.0 (until the absorbance at 280 nm
reached baseline level) and GCB was eluted with a linear gradient from 0% to 100% 50 mM
sodium citrate, 80% (v/v) ethylene glycol, 1 mM DTT, pH 5.0. Fractions were collected and
assayed for GCB activity using the GCB Activity Assay. Usually, wt GCB starts to elute at
approx. 70% (v/v) ethylene glycol.
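The reported elution point can be related to the position in the gradient by simple interpolation. The following is a minimal sketch, assuming ideal linear mixing between the wash buffer (20% (v/v) ethylene glycol) and the elution buffer (80% (v/v) ethylene glycol); the function names are illustrative and not part of the described method.

```python
def glycol_percent(percent_b, start=20.0, end=80.0):
    """Ethylene glycol content (% v/v) at a given gradient position (% elution buffer)."""
    return start + (end - start) * percent_b / 100.0


def gradient_position(glycol, start=20.0, end=80.0):
    """Gradient position (% elution buffer) at which a given glycol content is reached."""
    return 100.0 * (glycol - start) / (end - start)


if __name__ == "__main__":
    # wt GCB is reported to start eluting at approx. 70% (v/v) ethylene glycol,
    # which under these assumptions lies in the last fifth of the gradient.
    print(round(gradient_position(70.0), 1))
```

Under these assumptions an elution point of 70% ethylene glycol corresponds to roughly five sixths of the way through the gradient.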

The subsequent purification was done by either of the following two methods. Method #2
results in GCB of a higher purity.

Method #1 

GCB enriched fractions from the first process step were pooled and diluted approx. 4 times
with a buffer containing 50 mM sodium citrate, 5 mM DTT, pH 5.0 to reduce the ethylene
glycol content to 20% (or lower). In the second HIC purification step the diluted and
partially purified GCB was applied on a Toyopearl Phenyl resin (TosoHaas) equilibrated in
50 mM sodium citrate, 1 mM DTT, pH 5.0 (Buffer A) before use. After application, the resin
was washed with at least 3 column volumes of 50 mM sodium citrate, pH 5 (until the
absorbance at 280 nm reached baseline level) and GCB was then eluted with a linear ethanol
gradient from 0% to 100% buffer B (50 mM sodium citrate, 50% (v/v) ethanol, 1 mM DTT,
pH 5.0). Highly purified fractions of GCB (wildtype > 95% pure), identified using the GCB
Activity Assay, start to elute at approx. 40% ethanol. The purified GCB bulk product was
dialyzed against 50 mM sodium citrate, 0.2 M mannitol, 0.09% Tween 80, pH 6.1 to retain the
GCB activity upon subsequent storage at 4-8°C or at -80°C.

Method #2 

GCB enriched fractions eluted from the Toyopearl Butyl 650C resin were pooled and applied at
4°C on a SP Sepharose resin (Amersham Pharmacia Biotech) previously equilibrated in 25 mM
sodium citrate, 1 mM DTT, 10% ethylene glycol, pH 5.0. After application, the resin was
washed with 25 mM sodium citrate, 1 mM DTT, 10% ethylene glycol, pH 5.0 (until absorption
at 280 nm reached baseline level) and GCB was then eluted with a linear gradient from 0 to
100% 0.25 M sodium citrate, 1 mM DTT, 10% ethylene glycol, pH 5.0. GCB begins to elute
around 0.15 M sodium citrate. Fractions containing GCB were pooled and applied at room
temperature onto a Phenyl Sepharose High Performance resin (Pharmacia Biotech) previously
equilibrated in 25 mM sodium citrate, 1 mM DTT, pH 5.0. After application, the resin was
washed with 25 mM sodium citrate, 1 mM DTT, pH 5.0 until absorption at 280 nm reached
baseline level, and GCB was then eluted with a linear ethanol gradient from 0 to 100% 25 mM
sodium citrate, 1 mM DTT, 50% ethanol, pH 5.0. GCB typically elutes around 35% ethanol.
The purified GCB bulk product was dialyzed against either 50 mM sodium citrate, 1 mM DTT,
pH 5.0 or 50 mM sodium citrate, 0.2 M mannitol, 1 mM DTT, pH 6.1 to retain the GCB
activity upon subsequent storage. The purified GCB was concentrated and sterile-filtered
before storage at 4-8°C or at -80°C. Typically, GCB purified by this method is >95% pure.

EXAMPLE 2

Preparation of GCB with N-terminal peptide additions using a site-directed or random
mutagenesis approach

Nucleotide sequences encoding the following N-terminal peptide additions were added to the
nucleotide sequence shown in SEQ ID NO 2 encoding wtGCB: (A-4)+(N-3)+(I-2)+(T-1)
(representing an extension to the N-terminal of the amino acid sequence shown in SEQ ID NO
1 with the amino acid residues ANIT), and (A-7)+(S-6)+(P-5)+(I-4)+(N-3)+(A-2)+(T-1)
(ASPINAT).

A nucleotide sequence encoding the N-terminal peptide addition (A-4)+(N-3)+(I-2)+(T-1)
was prepared by PCR using the following conditions:

PCR 1:

Template: 10 ng pBlueBac5 with wt GCB cDNA sequence
Primer SO60: 5'-CAGCTGGCCATGGGTACCCGG-3' and
primer SO85:
5'-TGGGCATCAGGTGCCAACATTACAGCCCGCCCCTGCATCCCTAAAAGC-3'
BIO-X-ACT™ DNA polymerase (Bioline, London, U.K.)
1×OptiBuffer™ (Bioline, London, U.K.)
30 cycles of 96°C 30s, 55°C 30s, 72°C 1 min

PCR 2:

Template: 10 ng pBlueBac5 with wt GCB
Baculovirus forward primer: 5'-TTTACTGTTTTCGTAACAGTTTTG-3' and
primer SO86:
5'-GCAGGGGCGGGCTGTAATGTTGGCACCTGATGCCCACGACACI^
BIO-X-ACT™ DNA polymerase (Bioline, London, U.K.)
1×OptiBuffer™ (Bioline, London, U.K.)
30 cycles of 96°C 30s, 55°C 30s, 72°C 1 min

PCR 3:

3 µl of agarose gel purified PCR 1 and PCR 2 products (approx. 10 ng)
Baculovirus forward primer: 5'-TTTACTGTTTTCGTAACAGTTTTG-3' and primer SO60
BIO-X-ACT™ DNA polymerase (Bioline, London, U.K.)
1×OptiBuffer™ (Bioline, London, U.K.)
30 cycles of 96°C 30s, 55°C 30s, 72°C 1 min

PCR 3 was agarose gel purified, digested with NheI and NcoI, and cloned into
pBlueBac4.5+wtGCB digested with NheI and NcoI.
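The three PCR steps above amount to an overlap-extension strategy: PCR 1 and PCR 2 each carry the new codons at one end, and PCR 3 joins them through the shared region. The sketch below, which is only an illustration and not part of the described protocol, translates primer SO85 in the reading frame starting at its first base (an assumption) and checks that the legible start of primer SO86 is the reverse complement of the 5' part of SO85. The standard genetic code is used; all helper names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Primer SO85 as printed above.
SO85 = "TGGGCATCAGGTGCCAACATTACAGCCCGCCCCTGCATCCCTAAAAGC"
# Only the clearly legible 5' part of primer SO86 is used here.
SO86_LEGIBLE_START = "GCAGGGGCGGGCTGTAATGTTGGCAC"

protein = translate(SO85)
overlap = reverse_complement(SO85[:36])

if __name__ == "__main__":
    print(protein)                                 # the extension ANIT appears in frame
    print(overlap.startswith(SO86_LEGIBLE_START))  # the two primers overlap
```

In this frame the primer encodes ANIT directly upstream of the residues ARPCIPKS, which is consistent with the extension (A-4)+(N-3)+(I-2)+(T-1) described above.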

After confirmation of the correct mutations by DNA sequencing the plasmid was
transfected into insect cells using the Bac-N-Blue™ transfection kit from Invitrogen,
Carlsbad, CA, USA. Expression of the muteins was tested by western blotting and by activity
measurement of the muteins using the GCB Activity Assay.

Enzymatic activity of wtGCB (SEQ ID NO 1) expressed in the expression vector
pVL1392 in insect cells (Sf9) using an analogous method to that described in Example 1 gave
13 units/L, while the N-terminal peptide addition ASPINAT gave 28.5 units/L.

Construction of libraries of GCB with N-terminal peptide addition

Using random mutagenesis two different libraries were constructed on the basis of GCB
polypeptides with an N-terminal extension: library A with an N-terminal extension encoding
the following amino acid sequence AXNXTXNXTXNXT, and library B with an N-terminal
extension encoding ANXTNXTNXT.

Primers for library A were designed:

SO167: 5'-
GTGTCGTGGGCATCAGGTGCCNN(G/QAMCyT)Cr/A/G)N(G/C)AC(AmC)(J/A/G)^^^
C)AA(CAr)Cr/A/G)N(G/C)AC(AAr/C)Cr/A/G)N(G/C)AA(Cn')Cr/A/G)N(G^C)^^
CCGCCCCTGCATCCCTAAAAGC

SO168: 5'-GGCACCTGATGCCCACGACACTGCCTG
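The randomised positions in the library A primer are only partly legible, but the readable fragments suggest degenerate codons of the form NN(G/C) and (T/A/G)N(G/C). Taking that reading as an assumption, the sketch below enumerates the codons each scheme can produce and the amino acids they encode under the standard genetic code. The names NNS and DNS are shorthand introduced here, not terms from the description.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(positions):
    """List every codon allowed by three sets of bases, with the residues encoded."""
    codons = ["".join(combo) for combo in product(*positions)]
    residues = {CODON_TABLE[codon] for codon in codons}
    return codons, residues


# Assumed reading of the primer: N = any base, (G/C) at the third position.
NNS = ("ACGT", "ACGT", "GC")
# Assumed reading of the primer: (T/A/G) at the first position.
DNS = ("TAG", "ACGT", "GC")

nns_codons, nns_residues = expand(NNS)
dns_codons, dns_residues = expand(DNS)

if __name__ == "__main__":
    all_amino_acids = set("ACDEFGHIKLMNPQRSTVWY")
    print(len(nns_codons), sorted(all_amino_acids - nns_residues))
    print(len(dns_codons), sorted(all_amino_acids - dns_residues))
```

Under this reading both schemes can still produce the stop codon TAG ("*" in the table), and the second scheme cannot encode proline, histidine or glutamine.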

Primers for library B were designed using trinucleotides in the random positions.
X is a mixture of trinucleotide codons for all natural amino acid residues, except proline.
The trinucleotide codons used were the same as described by Kayushin et al., Nucleic Acids
Research, 24, 3748-3755, 1996.

SO165: 5'-
CGTGGGCATCAGGTGCCAAC(X)AC(An"/C)AA(CAr)(X)ACXA/r/QAA
OGCCCGCCCCTGCATCCCTAAAAGC

SO166: 5'-GTTGGCACCTGATGCCCACGACACTGCCTG

For both libraries:

SO60 and pBR10: 5'-TTT ACT GTT TTC GTA ACA GTT TTG

In all PCR reactions BIO-X-ACT™ DNA polymerase (Bioline, London, U.K.) and
1×OptiBuffer™ (Bioline, London, U.K.) were used. The PCR conditions were 30 cycles of
94°C 30s, 55°C 1 min, and 72°C 1 min.

Templates and primers used for preparing a nucleotide sequence encoding the N-terminal
extension by the above PCR were as follows:

PCR 1A:
Template: pGC12
Primers: SO60 + SO167

PCR 1B:
Template: pGC12
Primers: SO60 + SO165

PCR 2A:
Template: pGC12
Primers: SO168 + pBR10

PCR 2B:
Template: pGC12
Primers: SO166 + pBR10

PCR 3A:
Template: 1 µl of agarose gel purified PCR 1A and 2A products
Primers: SO60 + pBR10

PCR 3B:
Template: 1 µl of agarose gel purified PCR 1B and 2B products
Primers: SO60 + pBR10

PCR 3A and 3B were agarose gel purified, digested with NheI and NcoI, and ligated into
pGC-12 digested with NheI and NcoI. The ligation mixture was transformed into competent
E. coli. The diversity of the library was examined by DNA sequencing of different E. coli
clones and gave rise to the following amino acid sequences:



Library A:

1: AFNXTLNKTWN(F/L)T
2: TIVDSINTWNWTWNWT
3: -EXT wt
4: ALNSTGNLTVDGT
5: ASNSTFNLTENLT
6: TRNVnNCTUNST
7: -EXT wt
8: ALNWTYNGTKNVT
9: AANWTVNFTONFT
10: -EXT wt
11: AXNXTVNSTUNVT
12: ANNFITNGTLNLT
13: AGNWTANVTVNVT
14: AGNSTSNVTGNWT
15: AVNSTMNBEIAIPP (1 deletion - nonsense)
16: AGNGTVNGTINGT
17: AVNSTGNXTGNWT
18: AGNGTUNGTSNLT
19: -EXT wt
20: AMNSTKNSTLNir
21: AIWYTSKNST
22: -EXT wt
23: AVNATMNWTANGT
24: ASNSTNNGTLNAT
25: AIUSTOICOTTINLT
26: AP^I^^J^IDTVN^IT
27: AQNKTPNFTMNCT
28: AINVTWNCELNLT
29: AIJNTIWrNLT

Library B:

1: AmTNFTNEr
2: ANWTNRTNCT
3: ANWTNFTNWT
4: PTGUGTNFT
5: ANWTNKTNKT
6: ANNTNLTNAT
7: ANYIWIOTT
8: ANTTNQTNDT
9: -EXT wt
10: AmTNWTNTT
11: FTATNHrNST
12: -EXT wt
13: ANWTNQTNQT
14: ANWTNWTNAT
15: ANFTNKTNMT
16: ANHTNETNAT
17: AN(C/W)TNFTNET
18: ANLDKLHKUH (insertion - nonsense)
19: ANCFTNQTNFT
20: ANWTNWTNEWT
21: ANCmWTNCT
22: -EXT wt
23: -EXT wt
24: CHPYNWTNWT
25: ANETNYTNBT
26: ANWTNWT
27: AKPYKSYKFY (insertion - nonsense)
28: ANrTNKTNWT
29: A^rw^m(rrNI^
30: ANNTNRTNFT
31: ANWTNWTNWT
32: ANWRTNHTNKT
33: -EXT wt
34: ANQTNIINWT

Library B was transfected into insect cells using the Bac-N-Blue™ transfection kit
from Invitrogen, Carlsbad, CA, USA. First, 96 plaques from library B were picked and tested
by activity measurement (GCB Activity Assay). Plaques were selected as follows: 3 with high
activity, 3 with medium activity and 3 with low or no activity, and virus was purified for
DNA sequencing, resulting in the following amino acid sequences:

High activity:
1-1: Mixed sequence
1-2: ANFINVATNQT
1-3: (A)CN>nXLTNCK)T

Medium activity:
2-1: ANKTN(S/C)TNir
2-2: Mixed sequence
2-3: ANWrNCTNCDT

Low activity:
3-1: ANWTN(F/L)TNWT
3-2: CQLDURSTNBr
3-3: No sequence

From both libraries 96 plaques were picked and tested by activity measurement (GCB Activity
Assay). From each library 6 plaques with high activity were selected and virus was purified
for DNA sequencing. The amino acid sequences encoded by the different clones were:

Library A:
1: Mixed sequence
2: Mixed sequence
3: Mixed sequence
4: WT
5: ANNTlSrmJWT
6: ANNTNYTNWT

Library B:
1: AANOTUNWTVNCr
2: ATNITLNYTANTT
3: WT
4: AANSTGNITINGT
5: AVNWTSNDTSNST

GCB polypeptides of the invention were tested for various properties, including GCB activity,
stability in J774E cells and uptake in J774E cells. Unless otherwise stated the properties
were tested by use of the methods described in the Methods section herein.

In the table below the GCB activity of various GCB polypeptides of the invention is
listed together with the activity of the positives from Library A and B after plaque
purification.

Table 2

Plasmid   Vector        Mutations                                    # Glycosylation    Activity after plaque
                                                                     sites introduced   isolation (U/L)

pGC-1     pBlueBac4.5   Wt                                           0                  6
pGC-6     pBlueBac4.5   N-term: ANIT                                 1                  3
pGC-12    pVL1392       Wt                                           0                  13
pGC-13    pVL1392       N-term: ASPINAT                              1                  29
pGC-36    pVL1392       N-term: ASPINATSPINAT                        2                  16
pGC-38    pVL1392       N-term: ASPINAT, K194N, K321N                3                  16
pGC-40    pVL1392       N-term: ASPINAT, T132N, K293N, V295T         3                  3.5
pGC-47    pVL1392       N-term: AGNGTVNGTINGT                        3                  30
pGC-48    pVL1392       N-term: ASNSTNNGTLNAT                        3                  36
pGC-56    pVL1392       N-term: ASPINATSPINAT, K194N, K321N          4                  24
pGC-57    pVL1392       N-term: ASPINAT, T132N, K194N, K321N         4                  20
                        N-term: ASPINAT, T132N, K194N                3                  10
pGC-60    pVL1392       N-term: ANNTNYTNWT                           3                  P2: 14
pGC-61    pVL1392       N-term: ATNITLNYTANTT                        3                  P2: 38
pGC-62    pVL1392       N-term: AANSTGNITINGT                        3                  P2: 35
pGC-63    pVL1392       N-term: AVNWTSNDTSNST                        3                  P2: 66
pGC-65    pVL1392       AN N-term extension + R2T                    1                  37

Table 2: The plasmid column shows the number of the GCB polypeptide. The vector column
shows the plasmid vector used for expression of the polypeptide. The mutation column shows
the amino acid exchanges of the GCB polypeptide. N-terminal extensions are described as
N-term followed by the amino acid residues that make up the extension. The Activity column
gives the units per liter of GCB activity measured by the GCB Activity Assay on the
supernatant from Sf9 insect cells infected with one single plaque and grown in 3 ml of media
in a 6-well plate. Those labelled with P2 are activities measured on supernatant from
virus-infected cells grown in 15 ml T75 flasks.

Table 3

GCB polypeptide    Vmax    Km

Wildtype           0.57    87.7
Cerezyme           0.52    91.9
pGC36              0.60    70.6
pGC38              0.48    44.0
pGC56              0.39    32.2
pGC60              0.57    79.1
pGC61              0.74    100.5
pGC62              0.86    110.8
pGC63              0.51    83.1

Table 3: Calculated Vmax and Km for uptake in the J774E macrophage cell line of the
different GCB polypeptides. Vmax and Km were calculated from dose response curves (see
Fig. 1). The uptake of selected GCB polypeptides is shown in Figure 1.
As can be seen from Table 3, an increase in Vmax was observed for the N-terminally
extended GCB polypeptides (pGC60, pGC61, and pGC62).
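To make the Vmax and Km values easier to compare, the sketch below evaluates a saturable uptake curve for a few entries of Table 3. It assumes that uptake follows the usual Michaelis-Menten form v = Vmax·S/(Km + S); the description only states that the parameters were derived from dose response curves, and the units are not given, so the numbers are relative only. The function and dictionary names are illustrative.

```python
# (Vmax, Km) pairs copied from Table 3; units are not stated in the description.
UPTAKE_PARAMETERS = {
    "Wildtype": (0.57, 87.7),
    "Cerezyme": (0.52, 91.9),
    "pGC60": (0.57, 79.1),
    "pGC61": (0.74, 100.5),
    "pGC62": (0.86, 110.8),
}


def uptake_rate(vmax, km, dose):
    """Uptake at a given dose under the assumed Michaelis-Menten relationship."""
    return vmax * dose / (km + dose)


def highest_vmax(parameters):
    """Name of the polypeptide with the largest Vmax in the given table."""
    return max(parameters, key=lambda name: parameters[name][0])


if __name__ == "__main__":
    dose = 100.0  # arbitrary dose, same unspecified unit as Km
    for name, (vmax, km) in UPTAKE_PARAMETERS.items():
        print(name, round(uptake_rate(vmax, km, dose), 3))
    print(highest_vmax(UPTAKE_PARAMETERS))
```

At a dose equal to Km the assumed curve gives exactly half of Vmax, which is a quick sanity check on any fitted pair.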
EXAMPLE 3

Glycosylation of GCB polypeptides of the invention expressed in insect cells

MALDI-TOF mass spectrometry was used to investigate the amount of carbohydrate
attached to GCB polypeptides expressed in Sf9 cells.

The 6 GCB polypeptide variants investigated all contained additional potential
N-glycosylation sites compared to wtGCB.

WtGCB contains 5 potential N-glycosylation sites of which only 4 are used.

The 6 GCB polypeptide variants were:
GC-36: ASPINATSPINAT-GCB,
GC-38: ASPINAT-GCB(K194N, K321N),
GC-60: ANNTNYTNWT-GCB,
GC-61: ATNTTLNYTANTT-GCB,
GC-62: AANSTGNITINGT-GCB, and
GC-63: AVNWTSNDTSNST-GCB.
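The potential N-glycosylation sites in these extensions can be located by scanning for the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below does this for the six extension peptides as listed above; it looks at the extension only (not at the GCB part or the internal substitutions), and a match indicates a potential site, not actual occupancy. The helper name is illustrative.

```python
import re

# N, then any residue except P, then S or T. The lookahead allows overlapping hits.
SEQUON = re.compile(r"(?=N[^P][ST])")

# N-terminal extensions of the six variants, as listed above.
EXTENSIONS = {
    "GC-36": "ASPINATSPINAT",
    "GC-38": "ASPINAT",
    "GC-60": "ANNTNYTNWT",
    "GC-61": "ATNTTLNYTANTT",
    "GC-62": "AANSTGNITINGT",
    "GC-63": "AVNWTSNDTSNST",
}


def sequon_positions(peptide):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


if __name__ == "__main__":
    for name, peptide in EXTENSIONS.items():
        print(name, sequon_positions(peptide))
```

The positions found this way can be compared directly with the site numbering (N5, N11, and so on) used for each variant in the remainder of this example.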

WtGCB:

The theoretical peptide mass of wtGCB is 55 591 Da. WtGCB has 5 potential N-glycosylation
sites of which only 4 are used. As the two most common N-glycan structures on recombinant
proteins expressed in Sf9 cells are Man3GlcNAc2Fuc and Man3GlcNAc2, having masses of
1038.38 Da and 892.31 Da, respectively, the expected mass of wtGCB carrying 4 N-glycans is
between 59 159 Da and 59 743 Da.

MALDI-TOF mass spectrometry of wtGCB shows the broad peak typical of
glycoproteins with a peak mass of 59.3 kDa, in accordance with the expected mass of wtGCB
carrying 4 N-glycans.

GC-36 (ASPINATSPINAT-GCB):

The theoretical peptide mass of GC-36 is 56 829 Da. The N-terminal extension contains two
additional potential glycosylation sites at N5 and N11 compared to wtGCB. Assuming that the
wtGCB part of the variant is glycosylated like wtGCB, the variant has 6 potential
N-glycosylation sites.

As the two most common N-glycan structures on recombinant proteins expressed in Sf9
cells are Man3GlcNAc2Fuc and Man3GlcNAc2, having masses of 1038.38 Da and 892.31 Da,
respectively, the expected mass of GC-36 carrying 4 N-glycans is between 60 397 Da and
60 981 Da, the expected mass of GC-36 carrying 5 N-glycans is between 61 289 Da and 62 019
Da, and the expected mass of GC-36 carrying 6 N-glycans is between 62 181 Da and 63 057
Da.
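The expected mass ranges quoted in this example follow from adding n glycan masses to the theoretical peptide mass. The sketch below reproduces them; the quoted figures appear to be obtained when the two glycan masses are taken as whole daltons (892 and 1038 Da), which is an observation from the arithmetic and is treated here as an assumption. The function name is illustrative.

```python
# Glycan masses from the text, truncated to whole daltons (assumption, see above).
GLYCAN_MIN = 892    # Man3GlcNAc2, 892.31 Da
GLYCAN_MAX = 1038   # Man3GlcNAc2Fuc, 1038.38 Da


def expected_mass_range(peptide_mass, n_glycans):
    """Lower and upper expected mass (Da) for a peptide carrying n N-glycans."""
    return (
        peptide_mass + n_glycans * GLYCAN_MIN,
        peptide_mass + n_glycans * GLYCAN_MAX,
    )


if __name__ == "__main__":
    print("wtGCB, 4 glycans:", expected_mass_range(55591, 4))
    for n in (4, 5, 6):
        print("GC-36,", n, "glycans:", expected_mass_range(56829, n))
```

The same function applied to the peptide masses of the other variants gives the ranges listed for them below.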

MALDI-TOF mass spectrometry of GC-36 shows a rather broad peak with a peak mass
between 61.5 kDa and 62.9 kDa, in accordance with the expected mass of GC-36 carrying
either 5 or 6 N-glycans.

N-terminal amino acid sequence analysis of GC-36 showed that N5 is completely
glycosylated while N11 is partially glycosylated, in complete agreement with the result
obtained using mass spectrometry.

GC-38 (ASPINAT-GCB(K194N, K321N)):

The theoretical peptide mass of GC-38 is 56 217 Da. The N-terminal extension contains one
additional potential glycosylation site at N5 compared to wtGCB. In addition, the
substitutions of Lys194 and Lys321 with Asn residues introduce two additional potential
N-glycosylation sites. Assuming that the wtGCB part of the variant is glycosylated like
wtGCB, the variant has 7 potential N-glycosylation sites.

Based on the same considerations as those used for GC-36, the expected mass of GC-38
carrying 4 N-glycans is between 59 785 Da and 60 369 Da, the expected mass of GC-38
carrying 5 N-glycans is between 60 677 Da and 61 407 Da, the expected mass of GC-38
carrying 6 N-glycans is between 61 569 Da and 62 445 Da, and the expected mass of GC-38
carrying 7 N-glycans is between 62 461 Da and 63 483 Da.

MALDI-TOF mass spectrometry of GC-38 shows a major peak with a peak mass of
63.1 kDa, in accordance with the expected mass of GC-38 carrying 7 N-glycans. In addition, a
minor peak with a peak mass of 62.3 kDa is seen, which corresponds to GC-38 carrying 6
N-glycans.
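The assignment of an observed peak to a number of N-glycans can be sketched as a simple range check. The code below assumes the same whole-dalton glycan masses (892 and 1038 Da) that reproduce the ranges quoted above; the peak masses are those reported for GC-38 and wtGCB, converted from kDa. Since peaks are reported to 0.1 kDa and are broad, the result is indicative only. The function name is illustrative.

```python
GLYCAN_MIN = 892    # Man3GlcNAc2, truncated to whole daltons (assumption)
GLYCAN_MAX = 1038   # Man3GlcNAc2Fuc, truncated to whole daltons (assumption)


def consistent_glycan_counts(peptide_mass, peak_mass, max_sites):
    """Glycan counts whose expected mass range contains the observed peak mass."""
    return [
        n
        for n in range(max_sites + 1)
        if peptide_mass + n * GLYCAN_MIN <= peak_mass <= peptide_mass + n * GLYCAN_MAX
    ]


if __name__ == "__main__":
    # GC-38: theoretical peptide mass 56 217 Da, 7 potential sites.
    print(consistent_glycan_counts(56217, 63100, 7))  # major peak, 63.1 kDa
    print(consistent_glycan_counts(56217, 62300, 7))  # minor peak, 62.3 kDa
    # wtGCB: theoretical peptide mass 55 591 Da, 5 potential sites.
    print(consistent_glycan_counts(55591, 59300, 5))  # peak at 59.3 kDa
```

Because the ranges for consecutive glycan counts do not overlap for these peptide masses, each of the peaks above maps to a single count.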

N-terminal amino acid sequence analysis of GC-38 showed that N5 is completely
glycosylated.

GC-60 (ANNTNYTNWT-GCB):

The theoretical peptide mass of GC-60 is 56 770 Da. The N-terminal extension contains three
additional potential glycosylation sites at N2, N5 and N8 compared to wtGCB. Assuming that
the wtGCB part of the variant is glycosylated like wtGCB, the variant has 7 potential
N-glycosylation sites.

Based on the same considerations as those used for GC-36, the expected mass of GC-60
carrying 4 N-glycans is between 60 338 Da and 60 922 Da, the expected mass of GC-60
carrying 5 N-glycans is between 61 230 Da and 61 960 Da, the expected mass of GC-60
carrying 6 N-glycans is between 62 122 Da and 62 998 Da, and the expected mass of GC-60
carrying 7 N-glycans is between 63 014 Da and 64 036 Da.

MALDI-TOF mass spectrometry of GC-60 shows two broad peaks with peak masses of
61.9 kDa and 62.8 kDa, in accordance with the expected mass of GC-60 carrying either 5 or 6
N-glycans.

N-terminal amino acid sequence analysis of GC-60 showed that N2 is mainly
glycosylated, N5 is completely glycosylated, while N8 is only seldom glycosylated, in
acceptable agreement with the result obtained using mass spectrometry.

GC-61 (ATNITLNYTANTT-GCB):

The theoretical peptide mass of GC-61 is 56 970 Da. The N-terminal extension contains three
additional potential glycosylation sites at N3, N7 and N11 compared to wtGCB. Assuming that
the wtGCB part of the variant is glycosylated like wtGCB, the variant has 7 potential
N-glycosylation sites.

Based on the same considerations as used for GC-36, the expected mass of GC-61
carrying 4 N-glycans is between 60 538 Da and 61 122 Da, the expected mass of GC-61
carrying 5 N-glycans is between 61 430 Da and 62 160 Da, the expected mass of GC-61
carrying 6 N-glycans is between 62 322 Da and 63 198 Da, and the expected mass of GC-61
carrying 7 N-glycans is between 63 214 Da and 64 236 Da.

MALDI-TOF mass spectrometry of GC-61 shows a very broad peak with peak mass
between 61.5 kDa and 63.0 kDa, in accordance with the expected mass of GC-61 carrying
either 5 or 6 N-glycans.

N-terminal amino acid sequence analysis of GC-61 showed that N3 is completely
glycosylated while N7 and N11 are partially glycosylated, in acceptable agreement with the
result obtained using mass spectrometry.

GC-62 (AANSTGNITINGT-GCB):

The theoretical peptide mass of GC-62 is 56 806 Da. The N-terminal extension contains three
additional potential glycosylation sites at N3, N7 and N11 compared to wtGCB. Assuming that
the wtGCB part of the variant is glycosylated like wtGCB, the variant has 7 potential
N-glycosylation sites.

Based on the same considerations as those used for GC-36, the expected mass of GC-62
carrying 4 N-glycans is between 60 374 Da and 60 958 Da, the expected mass of GC-62
carrying 5 N-glycans is between 61 266 Da and 61 996 Da, the expected mass of GC-62
carrying 6 N-glycans is between 62 158 Da and 63 034 Da, and the expected mass of GC-62
carrying 7 N-glycans is between 63 050 Da and 64 072 Da.

MALDI-TOF mass spectrometry of GC-62 shows two broad peaks with peak masses of
61.6 kDa and 62.7 kDa, in accordance with the expected mass of GC-62 carrying either 5 or 6
N-glycans.

N-terminal amino acid sequence analysis of GC-62 showed that N3 is completely
glycosylated while N7 and N11 are partially glycosylated, in acceptable agreement with the
result obtained using mass spectrometry.

GC-63 (AVNWTSNDTSNST-GCB):

The theoretical peptide mass of GC-63 is 56 969 Da. The N-terminal extension contains three
additional potential glycosylation sites at N3, N7 and N11 compared to wtGCB. Assuming that
the wtGCB part of the variant is glycosylated like wtGCB, the variant has 7 potential
N-glycosylation sites.

Based on the same considerations as those used for GC-36, the expected mass of GC-63
carrying 4 N-glycans is between 60 537 Da and 61 121 Da, the expected mass of GC-63
carrying 5 N-glycans is between 61 429 Da and 62 159 Da, the expected mass of GC-63
carrying 6 N-glycans is between 62 321 Da and 63 197 Da, and the expected mass of GC-63
carrying 7 N-glycans is between 63 213 Da and 64 235 Da.

MALDI-TOF mass spectrometry of GC-63 shows a major peak with a peak mass of
61.9 kDa, in accordance with the expected mass of GC-63 carrying 5 N-glycans. In addition, a
minor peak with a peak mass of 62.9 kDa is seen, which corresponds to GC-63 carrying 6
N-glycans.

N-terminal amino acid sequence analysis of GC-63 showed that N3 and N7 are partially
glycosylated. It was not possible to evaluate the glycosylation status of N11.

Furthermore, insect cell expressed N-terminally extended glycosylated polypeptides
(GC-6 and GC-13) were subjected to N-terminal amino acid sequence analysis (using Procise
from PE Biosystems, Foster City, CA). The sequencing cycle was blank for the Asn residue in
both the ANIT and ASPINAT N-terminal peptide additions, demonstrating that the introduced
glycosylation site is glycosylated.

When GC-13 was subjected to mass spectrometry using the MALDI-TOF technique
on the Voyager DE-RP instrument (from PE Biosystems, Foster City, CA), the following
results were obtained:

The wildtype and ASPINAT-extended wildtype expressed in insect cells gave average
masses very close to the calculated masses of 59,727 Da and 61,421 Da, respectively,
assuming that four glycosylation sites were occupied by the carbohydrate FucGlcNAc2Man3.

EXAMPLE 4

Construction of plasmids for expression of FSH

A gene encoding the human FSH-alpha subunit was constructed by assembly of
synthetic oligonucleotides by PCR using methods similar to the ones described in Stemmer et
al. (1995) Gene 164, pp. 49-53. The native FSH-alpha signal sequence was maintained in order
to allow secretion of the gene product. The codon usage of the gene was optimised for high
expression in mammalian cells. Furthermore, in order to achieve high gene expression, an
intron (from pCI-Neo (Promega)) was included in the 5' untranslated region of the gene. The
synthetic gene was subcloned behind the CMV promoter in pcDNA3.1/Hygro (Invitrogen).
The sequence of the resulting plasmid, termed pBvdH977, is given in SEQ ID NO:3
(FSH-alpha-coding sequence at position 1225 to 1570). Similarly, a synthetic gene encoding
the wildtype human FSH-beta subunit was constructed. Also in this construct, the native
signal sequence was maintained (except for a Lys to Glu mutation at position 2) in order to
allow secretion, the codon usage was optimised for high expression and an intron was
included in the recipient vector (pcDNA3.1/Zeo (Invitrogen)). The sequence of the resulting
FSH-beta-containing plasmid, termed pBvdH1022, is given in SEQ ID NO:4 (FSH-beta-coding
sequence at position 1231 to 1617). A plasmid containing both the FSH-alpha and the
FSH-beta encoding synthetic genes was generated by subcloning the FSH-alpha containing
NruI-PvuII fragment from pBvdH977 into pBvdH1022 linearized with NruI. The resulting
plasmid, in which the FSH-alpha and FSH-beta expression cassettes are in direct orientation,
was termed pBvdH1100.

Expression of FSH in CHO cells

FSH was expressed in Chinese Hamster Ovary (CHO) K1 cells, obtained from the
American Type Culture Collection (ATCC, CCL-61).

For transient expression of FSH, cells were grown to 95% confluency in
serum-containing media (MEMα with ribonucleotides and deoxyribonucleotides (Life
Technologies Cat # 32571-028) containing 1:10 FBS (BioWhittaker Cat # 02-701F) and 1:100
penicillin and streptomycin (BioWhittaker Cat # 17-602E), or Dulbecco's MEM/Nut.-mix F-12
(Ham) L-glutamine, 15 mM Hepes, pyridoxine-HCl (Life Technologies Cat # 31330-038) with the
same additives). FSH-encoding plasmids were transfected into the cells using Lipofectamine
2000 (Life Technologies) according to the manufacturer's specifications. 24-48 hrs after
transfection, culture media were collected, centrifuged and filtered through 0.22 micrometer
filters to remove cells.

Stable clones expressing FSH were generated by transfection of CHO K1 cells with
FSH-encoding plasmids followed by incubation of the cells in selective media (for instance
one of the above media containing 0.5 mg/ml zeocin for cells transfected with plasmid
pBvdH1100). Stably transfected cells were isolated and sub-cloned by limited dilution. Clones
that produced high levels of FSH were identified by ELISA.

More specifically, the concentration of FSH in samples was quantified by use of a
commercial immunoassay (DRG FSH EIA, DRG Instruments GmbH, Marburg, Germany).
DRG FSH EIA is a solid phase immunosorbent assay (ELISA) based on the sandwich
principle. The microtiter wells are coated with a monoclonal antibody directed towards a
unique antigenic site on the FSH-β subunit. An aliquot of FSH-containing sample (diluted in
H2O with 0.1% BSA) and an anti-FSH antiserum conjugated with horseradish peroxidase are
added to the coated wells. After incubation, unbound conjugate is washed off with water. The
amount of bound peroxidase is proportional to the concentration of FSH in the sample, so the
intensity of colour developed upon addition of substrate solution is likewise proportional to
the concentration of FSH in the sample.

Large-scale production of FSH in CHO cells

The cell line CHO K1 1100-5, stably expressing human FSH, was passed 1:10 from a
confluent culture and propagated as adherent cells in serum-containing medium (Dulbecco's
MEM/Nut.-mix F-12 (Ham) L-glutamine, 15 mM Hepes, pyridoxine-HCl (Life Technologies
Cat # 31330-038), 1:10 FBS (BioWhittaker Cat # 02-701F), 1:100 penicillin and streptomycin
(BioWhittaker Cat # 17-602E)) until confluence in a 10 layer cell factory (NUNC #165250).
The media was then changed to serum-free media: Dulbecco's MEM/Nut.-mix F-12 (Ham)
L-glutamine, pyridoxine-HCl (Life Technologies Cat # 21041-025) with the addition of 1:500
ITS-A (Gibco/BRL # 51300-044), 1:500 EX-CYTE VLE (Serological Proteins Inc. # 81-129)
and 1:100 penicillin and streptomycin (BioWhittaker Cat # 17-602E). Subsequently, every 24
h, culture media were collected and replaced with 1 liter of fresh serum-free media of the
same composition. The collected media was filtered through 0.22 µm filters to remove cells.
Growth in cell factories was continued with daily harvests and replacements of the culture
media until FSH yields dropped below one-fourth of the initial expression level (typically
after 10-15 days).
EXAMPLE 5

Purification of FSH wildtype and variants

Three chromatographic steps were employed to obtain highly purified FSH: first an
anion exchange step, then hydrophobic interaction chromatography (HIC) and finally an
immunoaffinity step using an FSH-β specific monoclonal antibody.

Culture supernatants were prepared as described in Example 4. Filtered culture
supernatants were concentrated 10 to 20 times by ultrafiltration (10 kD cut-off membrane),
pH was adjusted to 8.0 and conductivity to 10-15 mS/cm, before application on a DEAE
Sepharose (Pharmacia) anion exchanger column, which had been equilibrated in ammonium
acetate buffer (0.16 M, pH 8.0). Semipurified FSH was recovered both in the unbound
flow-through fraction and in the wash fraction using 0.16 M ammonium acetate, pH 8.0. The
flow-through and wash fractions were pooled and ammonium sulfate was added from a stock
solution (4.5 M) to obtain a final concentration of 1.5 M (NH4)2SO4. The pH was adjusted to
7.0.
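The amount of 4.5 M stock needed to reach 1.5 M follows from a mass balance over the mixture. The sketch below assumes that the pooled fractions contain no ammonium sulfate beforehand and that volumes are additive; both are simplifying assumptions, and the function name is illustrative.

```python
def stock_volume_to_add(sample_volume, stock_molar=4.5, final_molar=1.5):
    """Volume of stock to add so that the mixture reaches the final concentration.

    Mass balance: stock_molar * v = final_molar * (sample_volume + v).
    """
    if stock_molar <= final_molar:
        raise ValueError("stock must be more concentrated than the target")
    return sample_volume * final_molar / (stock_molar - final_molar)


if __name__ == "__main__":
    # For the concentrations given above, half a sample volume of stock is needed.
    print(stock_volume_to_add(100.0))
```

For example, 100 ml of pooled fractions would need 50 ml of stock under these assumptions, giving 150 ml at 1.5 M.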

The partially purified FSH was subsequently applied on a 25 ml butyl Sepharose
(Pharmacia) HIC column. After application, the column was washed with at least 3 column
volumes of 1.5 M (NH4)2SO4, 20 mM ammonium acetate, pH 7 (until the absorbance at 280
nm reached baseline level) and FSH was eluted with 4 column volumes of buffer B (20 mM
ammonium acetate, pH 7). FSH enriched fractions from the HIC step were pooled,
concentrated and diafiltrated using Vivaspin 20 modules, 10 kD cut-off membrane
(Vivascience), into 50 mM sodium phosphate, 150 mM NaCl, pH 7.2.

For the third chromatographic step, an anti-FSH-β monoclonal antibody (RDI-FSH909,
Research Diagnostics) was immobilized to CNBr-activated Sepharose (Pharmacia) using a
standard procedure from the supplier. Approximately 1 mg antibody was coupled per ml resin.




The immuaoaffinity lesin was packed in plastic columns and equilibrated with 50 mM sodium 

phosphate, 150 mM NaQ, pH 7.2 before application. 

The buffer-exchanged eluate from the butyl HIC step was applied on the antibody 
column by use of gravity flow. This was followed by several washing steps in 50 mM sodium 
phosphate solutions (0.5 M NaCl and 1 M NaCl, both pH 7.2). Elution was performed using 
either 1 M NH3 or 0.6 M NH3, 40% (v/v) isopropanol and the eluate was immediately 
neutralized with 1 M acetic acid to pH 6-8. 

The purified FSH bulk product was concentrated and diafiltrated using Vivaspin 20 
modules, 10 kD cut-off membrane (Vivascience), into 50 mM sodium phosphate, 150 mM 
NaCl, pH 7.2. For subsequent storage, BSA was added to 0.1% (w/v) and the purified FSH was 
microfiltrated using a 0.22 µm filter prior to storage at -80°C. 

SDS-PAGE, run under non-dissociating conditions (without boiling), showed wildtype 
FSH migrating as an apparent 42±3 kDa band, slightly diffuse due to heterogeneity in the 
attached carbohydrates. The purity was about 80-90%. N-terminal sequencing showed that the 
α-chain had the expected N-terminal sequence starting with residue 1 (SEQ ID NO:5) and the 
β-chain starting with residue 3 (SEQ ID NO:6). These N-terminal sequences have been found 
previously for recombinant FSH produced in CHO cells (Olijve, W. et al. (1996) Mol. Hum. 
Reprod. 2, 371-382). 

EXAMPLE 6 

FSH in vitro activity assay 

6.1 FSH assay outline 

It has previously been published that activation of the FSH receptor by FSH leads to an 
increase in the intracellular concentration of cAMP. Consequently, transcription is activated at 
promoters containing multiple copies of the cAMP response element (CRE). It is thus possible 
to measure FSH activity by use of a CRE luciferase reporter gene introduced into CHO cells 
expressing the FSH receptor. 

6.2 Construction of a CHO FSH-R / CRE-luc cell line 

Stable clones expressing the human FSH receptor were produced by transfection of 
CHO K1 cells with a plasmid containing the receptor cDNA inserted into pcDNA3 
(Invitrogen) followed by selection in media containing 600 microg/ml G418. Using a 




commercial cAMP-SPA RIA (Amersham), clones were screened for the ability to respond to 
FSH stimulation. On the basis of these results, an FSH receptor-expressing CHO clone was 
selected for further transfection with a CRE-luc reporter gene. A plasmid containing the 
reporter gene with 6 CRE elements in front of the firefly luciferase gene was co-transfected 
with a plasmid conferring Hygromycin B resistance. Stable clones were selected in the 
presence of 600 microg/ml G418 and 400 microg/ml Hygromycin B. A clone yielding a robust 
luciferase signal upon stimulation with FSH (EC50 ~ 0.01 IU/ml) was obtained. This CHO 
FSH-R / CRE-luc cell line was used to measure the activity of samples containing FSH. 

6.3 FSH luciferase assay 

To perform activity assays, CHO FSH-R / CRE-luc cells were seeded in white 96 well 
culture plates at a density of about 15,000 cells/well. The cells were in 100 µl DMEM/F-12 
(without phenol red) with 1.25% FBS. After incubation overnight (at 37°C, 5% CO2), 25 µl of 
sample or standard diluted in DMEM/F-12 (without phenol red) with 10% FBS was added to 
each well. The plates were further incubated for 3 hrs, followed by addition of 125 µl Luclite 
substrate (Packard Bioscience). Subsequently, plates were sealed and luminescence was 
measured on a TopCount luminometer (Packard) in SPC (single photon counting) mode. 
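
The volumes given for the assay fix the composition of each well during the stimulation period. The following sketch works this out from the stated volumes and FBS percentages only; it assumes ideal mixing and additive volumes, and the helper name `mixed_concentration` is introduced here for illustration.

```python
def mixed_concentration(volumes_ul, concentrations):
    """Volume-weighted mean concentration of several mixed solutions."""
    total = sum(volumes_ul)
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum(v * c for v, c in zip(volumes_ul, concentrations)) / total


CELL_MEDIUM_UL = 100.0   # medium on the cells, 1.25% FBS (from the text)
SAMPLE_UL = 25.0         # sample or standard, diluted in medium with 10% FBS (from the text)

final_fbs_percent = mixed_concentration([CELL_MEDIUM_UL, SAMPLE_UL], [1.25, 10.0])
sample_dilution = (CELL_MEDIUM_UL + SAMPLE_UL) / SAMPLE_UL

print(f"FBS during stimulation: {final_fbs_percent:.2f}%")
print(f"sample diluted {sample_dilution:.0f}-fold in the well")
```

On these figures the cells are stimulated in about 3% FBS, each sample is diluted five-fold in the well, and the 125 µl of substrate is added to an equal volume of culture.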

EXAMPLE 7 

Construction and analysis of a variant form of FSH containing two N-linked glycosylations at 
the N-terminus of the alpha subunit 

A construct encoding a modified form of FSH-alpha, having two additional sites for N-linked 
glycosylation at its N-terminus, was generated by site-directed mutagenesis using 
standard DNA techniques known in the art. A DNA fragment encoding the sequence 
Ala-Asn-Ile-Thr-Val-Asn-Ile-Thr-Val was inserted immediately upstream of the mature FSH-alpha 
sequence in pBvdH977. The sequence of the resulting plasmid, termed pBvdH1163, is given in 
SEQ ID NO:7 (modified FSH-alpha-encoding sequence at position 1225 to 1599). A plasmid 
encoding both subunits was constructed by subcloning the FSH-containing NruI-PvuII 
fragment from pBvdH1163 into pBvdH1022 (Example 4), which had been linearized with 
PvuII. The resulting plasmid was termed pBvdH1208. 
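
As a quick consistency check on the coordinates quoted above, the sketch below compares the modified coding span (1225 to 1599) with the FSH-alpha coding span annotated for SEQ ID NO:3 in the sequence listing, (1225)..(1572). It assumes that both spans follow the same convention regarding the stop codon; the helper `span_length` is introduced here for illustration.

```python
def span_length(start, end):
    """Number of positions in an inclusive, 1-based span."""
    return end - start + 1


# The nine inserted residues named in the text.
INSERT = "Ala-Asn-Ile-Thr-Val-Asn-Ile-Thr-Val".split("-")

wildtype_nt = span_length(1225, 1572)   # coding span annotated for SEQ ID NO:3
modified_nt = span_length(1225, 1599)   # modified coding span quoted for SEQ ID NO:7
extra_codons = (modified_nt - wildtype_nt) // 3

print(wildtype_nt, modified_nt, extra_codons)
```

The difference of 27 nucleotides corresponds to nine codons, which matches the nine-residue insert.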

For expression of the variant form of FSH containing two N-linked glycosylations at 
the N-terminus of the alpha subunit (termed FSH1208), CHO K1 cells were transfected with 
pBvdH1208 or co-transfected with a combination of pBvdH1163, encoding the modified alpha 
subunit, and pBvdH1022, encoding the wildtype beta subunit. Transient expressions, isolation 
of stable expression clones, and large-scale production of FSH1208 were performed as 
described for wildtype FSH in Example 4. 

The FSH content of samples was analysed by Western blotting: proteins were 
separated by SDS-PAGE and a standard Western blot was performed using rabbit anti-human 
FSH (AHP519, Serotec) or mouse anti-human FSH-alpha (MCA338, Serotec) as primary 
antibody, and an ImmunoPure Ultra Sensitive ABC Peroxidase Staining Kit (Pierce) for 
detection. Western blotting showed that FSH1208 had a larger molecular mass than wildtype 
FSH, indicating that the introduction of acceptor sites for N-linked glycosylation at the 
N-terminus of the alpha subunit indeed led to hyperglycosylation of FSH. For analysis of pI, 
samples were separated on pH 3-7 IEF gels (NOVEX). After electrophoresis, proteins were 
blotted onto Immobilon-P (Millipore) membranes and a Western blot was performed as 
described above, using the same antibodies and detection kit. Isoelectric focusing demonstrated 
that the FSH forms in the FSH1208 samples were found in a lower pI range than wildtype 
FSH. Thus, the pH interval for FSH1208 isoforms was about 3.0-4.5 versus about 4.0-5.2 for 
wildtype FSH. This indicated that FSH1208 molecules are on average more negatively charged 
than the wild type, which is attributed to the presence of additional sialic acid residues. 

FSH1208 was purified and characterized as described in Example 5. SDS-PAGE, run 
under non-dissociating conditions (without boiling), showed FSH1208 migrating as an 
apparent 55±5 kDa band, slightly diffuse due to heterogeneity in the attached carbohydrates. 
The purity was about 80-90%. N-terminal sequencing showed that while the β-chain had the 
same N-terminal sequence as wildtype FSH, the sequence of the α-chain was in agreement with 
this subunit carrying the expected N-terminal extension ANITVNITV, in which both 
asparagine residues are glycosylated. 

The specific activity of FSH1208 was determined by measurement of the in vitro 
bioactivity (FSH luciferase assay, Example 6) and the FSH content of the samples by ELISA. 
The specific activity of FSH1208 was found to be about one-third of that of the wildtype 
reference. 
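
Specific activity here is the ratio of bioactivity to FSH content. The sketch below only illustrates the arithmetic: the numeric readings are hypothetical placeholders chosen to reproduce a one-third ratio, not data from the experiments, and the units (IU/ml, µg/ml) are assumed.

```python
def specific_activity(bioactivity_iu_per_ml, content_ug_per_ml):
    """Bioactivity per unit mass of FSH (IU per microgram)."""
    if content_ug_per_ml <= 0:
        raise ValueError("FSH content must be positive")
    return bioactivity_iu_per_ml / content_ug_per_ml


# Hypothetical readings, for illustration only (not measured values).
wildtype = specific_activity(bioactivity_iu_per_ml=30.0, content_ug_per_ml=3.0)
variant = specific_activity(bioactivity_iu_per_ml=10.0, content_ug_per_ml=3.0)

relative = variant / wildtype
print(f"variant specific activity relative to wildtype: {relative:.2f}")
```

Normalizing to the wildtype reference measured in the same assay run removes the dependence on the absolute calibration of the luciferase signal.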

A pharmacokinetic study was performed as follows: 

Immature 26-27 days old female Sprague-Dawley rats were injected i.v. with 3-4 
microg FSH, produced, purified and analyzed as described above. Subsequently, blood samples 
were taken at various time-points after injection. FSH concentrations in serum samples were 
determined by ELISA, as described above. 

In vivo bioactivity of wildtype recombinant FSH and variant forms may be evaluated by 
the ovarian weight augmentation assay (Steelman and Pohley (1953) Endocrinology 53, 604-616). 
Furthermore, the ability of FSH and variant forms to stimulate maturation of follicles in 
laboratory animals may be detected with e.g. ultrasound equipment. The experiment showed 
that 24 hours after injection of equal amounts of wildtype FSH and FSH1208, the sera of 
FSH1208-treated animals contained more than 10 fold more remaining immunoreactive 
material than the sera from animals treated with wildtype FSH. 

EXAMPLE 8 

Construction and analysis of other FSH variants containing additional glycosylation sites 

Plasmids encoding variant forms of FSH-alpha and FSH-beta containing additional 
sites for N-linked glycosylation were generated by site-directed mutagenesis using standard 
DNA techniques known in the art. The following amino acid substitutions and/or insertions 
were generated: 

FSH1147: Amino acid Tyr58 of mature FSH-beta altered to Asn 

FSH1349: N-terminus of mature FSH-alpha altered from APD QDC. . . to: APIN DTVNFT QDC 

FSH1354: N-terminus of mature FSH-beta altered from NS CEL ... to: NS NITVNITV CEL ... 

Plasmids encoding the variant forms were transiently expressed in CHO K1 cells as 
described in Example 4. Plasmids encoding FSH-alpha variants were co-transfected with a 
plasmid encoding wild-type FSH-beta and vice versa. 

Western blotting and isoelectric focusing were performed on culture media samples as 
described above. The variant forms had higher molecular weights than the wild-type, indicating 
that the additional acceptor sites for N-linked glycosylation had indeed been glycosylated. 
Furthermore, isoelectric focusing showed that the different isoforms of the three FSH variants 
were spread over a lower pI range than the wildtype. This strongly suggests that the variant 
forms had a higher sialic acid content than the wildtype. 

In vitro FSH activities of the resulting media samples were analysed as described in 
Example 6.3. All three variant forms were able to stimulate the CHO FSH-R / CRE-luc cells, 
indicating that these variant FSH forms have retained significant FSH activity. 

While the foregoing invention has been described in some detail for purposes of clarity and 
understanding, it will be clear to one skilled in the art from a reading of this disclosure that 
various changes in form and detail can be made without departing from the true scope of the 
invention. For example, all the techniques, methods, compositions, apparatus and systems 
described above may be used in various combinations. All publications, patents, patent 
applications, or other documents cited in this application are incorporated by reference in their 
entirety for all purposes to the same extent as if each individual publication, patent, patent 
application, or other document were individually indicated to be incorporated by reference for 
all purposes. 

CLAIMS 

1. A glycosylated polypeptide comprising the primary structure 
NH2-X-Pp-COOH 
wherein 
X is a peptide addition comprising or contributing to a glycosylation site, and 
Pp is a polypeptide of interest. 

2. A glycosylated polypeptide comprising the primary structure NH2-Px-X-Py-COOH, 
wherein 
Px is an N-terminal part of a polypeptide Pp of interest, 
Py is a C-terminal part of said polypeptide Pp, and 
X is a peptide addition comprising or contributing to a glycosylation site. 

3. The polypeptide according to claim 1 or 2, wherein Pp is a mature polypeptide. 

4. The polypeptide according to claim 2 or 3, wherein Px is a non-structural N-terminal 
part of a mature polypeptide Pp, and Py is a structural C-terminal part of said mature 
polypeptide. 

5. The polypeptide according to any of claims 1-4, wherein Pp is a native polypeptide. 

6. The polypeptide according to any of claims 1-5, wherein Pp is a variant of a native 
polypeptide. 

7. The polypeptide according to claim 6, wherein Pp comprises at least one introduced 
and/or at least one removed glycosylation site for a non-peptide moiety as compared to the 
corresponding native polypeptide. 

8. The polypeptide according to any of claims 1-7, wherein Pp is of mammalian origin. 

9. The polypeptide according to claim 8, wherein Pp is of human origin. 

10. The polypeptide according to any of claims 1-9, wherein Pp is a therapeutic 
polypeptide. 

11. The polypeptide according to any of claims 1-10, wherein Pp is selected from the 
group consisting of an antibody or antibody fragment, a plasma protein, an erythrocyte or 
thrombocyte protein, a cytokine, a growth factor, a profibrinolytic protein, a protease inhibitor, 
an antigen, an enzyme, a ligand, a receptor, or a hormone. 

12. The polypeptide according to any of claims 1-7, 10 or 11, wherein Pp is of 
microbial origin. 

13. The polypeptide according to claim 12, wherein Pp is a microbial enzyme. 

14. The polypeptide according to claim 13, wherein Pp is selected from the group 
consisting of protease, amylase, amyloglucosidase, pectinase, lipase and cutinase. 

15. The polypeptide according to any of claims 1-14, wherein X comprises 1-500 
amino acid residues. 

16. The polypeptide according to claim 15, wherein X comprises 2-50 amino acid 
residues, such as 3-20 amino acid residues. 

17. The polypeptide according to any of claims 1-16, wherein X comprises 1-20, in 
particular 1-10 glycosylation sites. 

18. The polypeptide according to any of the preceding claims, wherein X comprises at 
least one glycosylation site within a stretch of 30 amino acid residues, such as at least one 
within 20 amino acid residues, in particular at least one within 10 amino acid residues, in 
particular 1-3 glycosylation sites. 

19. The polypeptide according to any of claims 1-18, wherein X comprises at least two 
glycosylation sites, wherein two of said sites are separated by at most 10 amino acid residues, 
none of which comprises a glycosylation site. 

20. The polypeptide according to any of claims 6-19, wherein the polypeptide Pp is a 
variant of a native polypeptide which, as compared to said native polypeptide, comprises at 
least one introduced or at least one removed glycosylation site. 

21. The polypeptide according to claim 20, wherein the polypeptide Pp comprises at 
least one introduced glycosylation site, in particular 1-5 introduced glycosylation sites. 

22. The polypeptide according to any of claims 1-21, wherein X has an N residue in 
position -2 or -1, and Pp has a T or an S residue in position +1 or +2, respectively, the residue 
numbering being made relative to the N-terminal amino acid residue of Pp. 
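
The positional rule of claim 22 describes a glycosylation site that spans the junction between X and Pp. The following sketch tests for it, taking position -1 as the last residue of X and +1 as the first residue of Pp. The one-letter sequences used are hypothetical, and the check deliberately adds no condition beyond those stated in the claim (for instance, it does not exclude proline at the intervening position).

```python
def junction_sites(x, pp):
    """Return (N position in X, T/S position in Pp) pairs that satisfy claim 22.

    Positions are relative to the N-terminal residue of Pp: the last residue
    of X is -1, the first residue of Pp is +1.
    """
    hits = []
    # N at -2 in X paired with T or S at +1 in Pp
    if len(x) >= 2 and len(pp) >= 1 and x[-2] == "N" and pp[0] in ("T", "S"):
        hits.append((-2, 1))
    # N at -1 in X paired with T or S at +2 in Pp
    if len(x) >= 1 and len(pp) >= 2 and x[-1] == "N" and pp[1] in ("T", "S"):
        hits.append((-1, 2))
    return hits


# Hypothetical peptide additions and polypeptide N-termini, for illustration only.
print(junction_sites("ANI", "TVKQ"))   # N at -2, T at +1
print(junction_sites("GSAN", "ATQ"))   # N at -1, T at +2
print(junction_sites("AG", "TVKQ"))    # no junction site
```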

23. The polypeptide according to any of claims 1-22, wherein X has the structure 
X1-N-X2-[T/S]-Z, wherein X1 is a peptide comprising at least one amino acid residue or is absent, X2 
is any amino acid residue different from a proline residue, and Z is absent or a peptide 
comprising at least one amino acid residue, the N-terminal amino acid residue of which is 
different from a proline. 
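
The structure recited in claim 23 can be read as a pattern over the one-letter sequence of X. As a minimal sketch, the function below lists every way a peptide addition can be split into X1, N, X2, T/S and Z under the stated constraints (X2 is not proline, and Z is either absent or does not start with proline). The function name and the dictionary keys are introduced here for illustration; the test sequence ANITVNITV is the extension used in Example 7.

```python
import re

# N, then any residue except P, then T or S, not followed by P.
# The lookahead wrapper lets overlapping candidate sites be reported too.
SITE = re.compile(r"(?=(N[^P][ST](?!P)))")


def decompositions(x):
    """List all X1-N-X2-[T/S]-Z readings of the peptide addition x."""
    results = []
    for match in SITE.finditer(x):
        i = match.start()
        results.append({
            "X1": x[:i],        # may be empty (X1 absent)
            "X2": x[i + 1],
            "T/S": x[i + 2],
            "Z": x[i + 3:],     # may be empty (Z absent)
        })
    return results


for reading in decompositions("ANITVNITV"):
    print(reading)
```

Each reading corresponds to one candidate N-glycosylation site, so the number of readings also gives the number of sites in X that fit this pattern.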

24. The polypeptide according to claim 23, wherein X1 is absent, X2 is an amino acid 
residue selected from the group consisting of I, A, G, V and S, and Z comprises at least one 
amino acid residue, the N-terminal amino acid residue of which is different from proline. 

25. The polypeptide according to claim 24, wherein Z is a peptide comprising 1-50 
amino acid residues, preferably comprising 1-10 glycosylation sites. 

26. The polypeptide according to claim 25, wherein X1 comprises at least one amino 
acid residue, X2 is an amino acid residue selected from the group consisting of I, A, G, V and 
S, and Z is absent. 

27. The polypeptide according to claim 26, wherein X1 is a peptide comprising 1-50 
amino acid residues, preferably comprising 1-10 glycosylation sites. 

28. The polypeptide according to any of claims 1-27, wherein X comprises a peptide 
sequence selected from the group consisting of INA[T/S], GNI[T/S], VNI[T/S], SNI[T/S], 
ASNI[T/S], NI[T/S], SPINA[T/S], ASPINA[T/S], A]SaiT/S]ANI[T/S]A]SII, 
ANI[T/S]GSNI[T/S]GSNI[T/S], ESliI|T/S]VNI|T/S]V, 
YNI[T/S]VNI[T/S]V, AFNI[T/S]VNI[T/S]V, AYNI[T/S]VNI[T/S]V, APND[T/S]VNI[T/S]V, 
ANI[T/S], ASNS[T/S]NNG[T/S]LNA[T/S], ANH[T/S]NE[T/S]NA[T/S], GSPINA[T/S], 
ASPINA[T/S]SPINA[T/S], ANN[T/S]NY[T/S]NW[T/S], ATM[T/S]LOTrr/S]AN|T/S]T, 
AANS[T/S]GNI[T/S]]NG[T/S], AVNW[T/S]SND[T/S]SNS[T/S], GNA[T/S], 
AVNW[T/S]SND[T/S]SNS[T/S], ANN[T/S]NY[T/S]NS[T/S], ANNTNYTNWT, 
ANI[T/S]VNI[T/S]V, ND[T/S]VNF[T/S] and M[T/S]VNI[T/S]V, wherein [T/S] is either a T 
or an S residue, preferably a T residue. 

29. The polypeptide according to any of claims 1-29, wherein the peptide addition X 
comprises the sequence NSTQNATA or ANLTVRNLTRNVTV. 

30. The polypeptide according to any of the preceding claims, further comprising an 
attachment group for a second non-peptide moiety, said attachment group being linked to the 
second non-peptide moiety. 

31. The polypeptide according to claim 30, wherein the non-peptide moiety is selected 
from the group consisting of a polymer molecule, a lipophilic group and an organic 
derivatizing agent. 

32. The polypeptide according to claim 30 or 31, wherein the attachment group for the 
non-peptide moiety is one present on an amino acid residue selected from the group consisting 
of the N-terminal amino acid residue, the C-terminal amino acid residue, lysine, cysteine, 
arginine, glutamine, aspartic acid, glutamic acid, serine, tyrosine, histidine, phenylalanine and 
tryptophan. 

33. The polypeptide according to any of claims 30-32, wherein the polypeptide Pp is a 
variant of a native polypeptide, which as compared to said native polypeptide, comprises at 
least one introduced and/or at least one removed attachment group for the second non-peptide 
moiety. 

34. The polypeptide according to claim 33, wherein the polypeptide Pp comprises at 
least one introduced attachment group, in particular 1-5 introduced attachment groups. 

35. The polypeptide according to any of the preceding claims, which has a molecular 
weight of at least 67 kDa, in particular at least 70 kDa. 

36. The polypeptide according to any of the preceding claims, which has at least one 
of the following properties relative to the polypeptide Pp, the properties being measured under 
comparable conditions: 
in vitro bioactivity which is at least 25% of that of the polypeptide Pp as measured under 
comparable conditions, increased affinity for a mannose receptor or other carbohydrate 
receptors, increased serum half-life, increased functional in vivo half-life, reduced renal 
clearance, reduced immunogenicity, increased resistance to proteolytic cleavage, improved 
targeting to lysosomes, macrophages and/or other subpopulations of human cells, improved 
stability in production, 
improved shelf life, improved formulation, e.g. liquid formulation, improved purification, 
improved solubility, and/or improved expression. 

37. A nucleotide sequence encoding the polypeptide according to any of claims 1-36. 

38. A vector comprising the nucleotide sequence according to claim 37. 

39. A host cell transformed or transfected with a nucleotide sequence according to 
claim 37, or a vector according to claim 38. 

40. The host cell according to claim 39, which is a glycosylating host cell. 

41. The host cell according to claim 40, which is a mammalian cell, an invertebrate cell 
such as an insect cell, a yeast cell or a plant cell, or a transgenic animal. 

42. A method of producing the polypeptide according to any of claims 1-36, comprising 
culturing a host cell according to any of claims 39-41 under conditions permitting expression 
of the polypeptide and recovering the polypeptide from the culture. 

43. A method of producing a polypeptide according to any of claims 30-36 attached to 
a second non-peptide moiety, which method comprises subjecting the polypeptide to 
conjugation to the non-peptide moiety under conditions for the conjugation to take place. 

44. The method according to claim 43, wherein the polypeptide is prepared by the 
method according to 42 or 43. 

45. A method of preparing a nucleotide sequence according to claim 37, which method 
comprises 

a) subjecting a nucleotide sequence encoding the polypeptide Pp to elongation mutagenesis, 

b) expressing the mutated nucleotide sequence obtained in step a) in a suitable host cell, 
optionally 

c) conjugating polypeptides expressed in step b) to a second non-peptide moiety, 

d) selecting polypeptides obtained in step b) or c) which comprise at least one oligosaccharide 
moiety and optionally a second non-peptide moiety attached to the peptide addition part of the 
polypeptide, and 

e) isolating a nucleotide sequence encoding the polypeptide selected in step d). 

46. The method according to claim 45, which further comprises screening polypeptides 
resulting from step b) or c) for at least one improved property, and wherein the selection step d) 
further comprises selecting polypeptides having such improved property. 

47. The method according to claim 45 or 46, wherein the elongation mutagenesis is 
conducted so as to enrich for codons encoding an amino acid residue comprising a 
glycosylation site. 

48. The method according to claim 45 or 46, wherein the elongation mutagenesis is 
conducted so as to enrich for codons required for introduction of an attachment group for a 
second non-peptide moiety. 

49. The method according to any of claims 44-48, which further comprises subjecting 
the part of the nucleotide sequence encoding Pp to mutagenesis to remove and/or introduce 
glycosylation sites and optionally amino acid residues comprising an attachment group for the 
second non-peptide moiety. 

50. The method according to any of claims 45-49, wherein the selection in step d) is 
performed so as to select a conjugate having at least one of the properties defined in claim 36. 

51. A method of producing a glycosylated polypeptide encoded by a nucleotide 
sequence prepared according to claims 45-50, wherein the nucleotide sequence encoding the 
polypeptide selected in step c) is expressed in a glycosylating host cell and the resulting 
glycosylated expressed polypeptide is recovered. 

52. A method of improving one or more selected properties of a polypeptide Pp of 
interest, which method comprises 

a) preparing a nucleotide sequence encoding a polypeptide with the primary structure 
NH2-X-Pp-COOH, 
wherein 
X is a peptide addition comprising or contributing to a glycosylation site that is capable of 
conferring the selected improved property/ies to the polypeptide Pp, 

b) expressing the nucleotide sequence of a) in a suitable host cell, optionally 

c) conjugating the expressed polypeptide of b) to a second non-peptide moiety, and 

d) recovering the polypeptide resulting from step c). 

53. The method according to claim 52, wherein the polypeptide Pp and/or the peptide 
addition X is as defined in any of claims 1-40. 

54. The method according to claim 52 or 53, wherein the nucleotide sequence of step a) 
is prepared by subjecting a nucleotide sequence encoding the polypeptide Pp to random 
elongation mutagenesis. 

55. The method according to claim 54, wherein the random elongation mutagenesis is 
conducted so as to enrich for codons encoding an amino acid residue comprising or 
contributing to a glycosylation site and/or an attachment group for the second non-peptide 
moiety. 

56. The method according to any of claims 52-55, wherein, in the preparation of the 
nucleotide sequence of a), the part of the nucleotide sequence encoding the polypeptide Pp is 
subjected to mutagenesis to remove and/or introduce a glycosylation site or to remove and/or 
introduce an attachment group for a second non-peptide moiety. 

57. The method according to any of claims 52-56, wherein the property/ies to be 
improved is/are selected from the properties defined in claim 37. 

SEQUENCE LISTING 

<110> MAXYGEN APS 

<120> N-TERMINALLY EXTENDED POLYPEPTIDES 

<130> 0217WO210 

<170> PatentIn Ver. 2.1 

<210> 1 
<211> 497 
<212> PRT 
<213> Homo sapiens 

<220> 
<221> MOD_RES 
<222> (495) 
<223> R or H 

<400> 1 

Ala Arg Pro Cys Ile Pro Lys Ser Phe Gly Tyr Ser Ser Val Val Cys 
1 5 10 15 

Val Cys Asn Ala Thr Tyr Cys Asp Ser Phe Asp Pro Pro Thr Phe Pro 
20 25 30 

Ala Leu Gly Thr Phe Ser Arg Tyr Glu Ser Thr Arg Ser Gly Arg Arg 
35 40 45 

Met Glu Leu Ser Met Gly Pro Ile Gln Ala Asn His Thr Gly Thr Gly 
50 55 60 

Leu Leu Leu Thr Leu Gln Pro Glu Gln Lys Phe Gln Lys Val Lys Gly 
65 70 75 80 

Phe Gly Gly Ala Met Thr Asp Ala Ala Ala Leu Asn Ile Leu Ala Leu 
85 90 95 

Ser Pro Pro Ala Gln Asn Leu Leu Leu Lys Ser Tyr Phe Ser Glu Glu 
100 105 110 

Gly Ile Gly Tyr Asn Ile Ile Arg Val Pro Met Ala Ser Cys Asp Phe 
115 120 125 

Ser Ile Arg Thr Tyr Thr Tyr Ala Asp Thr Pro Asp Asp Phe Gln Leu 
130 135 140 

His Asn Phe Ser Leu Pro Glu Glu Asp Thr Lys Leu Lys Ile Pro Leu 
145 150 155 160 

Ile His Arg Ala Leu Gln Leu Ala Gln Arg Pro Val Ser Leu Leu Ala 
165 170 175 

Ser Pro Trp Thr Ser Pro Thr Trp Leu Lys Thr Asn Gly Ala Val Asn 
180 185 190 

Gly Lys Gly Ser Leu Lys Gly Gln Pro Gly Asp Ile Tyr His Gln Thr 
195 200 205 

Trp Ala Arg Tyr Phe Val Lys Phe Leu Asp Ala Tyr Ala Glu His Lys 
210 215 220 

Leu Gln Phe Trp Ala Val Thr Ala Glu Asn Glu Pro Ser Ala Gly Leu 
225 230 235 240 

Leu Ser Gly Tyr Pro Phe Gln Cys Leu Gly Phe Thr Pro Glu His Gln 
245 250 255 

Arg Asp Phe Ile Ala Arg Asp Leu Gly Pro Thr Leu Ala Asn Ser Thr 
260 265 270 

His His Asn Val Arg Leu Leu Met Leu Asp Asp Gln Arg Leu Leu Leu 
275 280 285 

Pro His Trp Ala Lys Val Val Leu Thr Asp Pro Glu Ala Ala Lys Tyr 
290 295 300 

Val His Gly Ile Ala Val His Trp Tyr Leu Asp Phe Leu Ala Pro Ala 
305 310 315 320 

Lys Ala Thr Leu Gly Glu Thr His Arg Leu Phe Pro Asn Thr Met Leu 
325 330 335 

Phe Ala Ser Glu Ala Cys Val Gly Ser Lys Phe Trp Glu Gln Ser Val 
340 345 350 

Arg Leu Gly Ser Trp Asp Arg Gly Met Gln Tyr Ser His Ser Ile Ile 
355 360 365 

Thr Asn Leu Leu Tyr His Val Val Gly Trp Thr Asp Trp Asn Leu Ala 
370 375 380 

Leu Asn Pro Glu Gly Gly Pro Asn Trp Val Arg Asn Phe Val Asp Ser 
385 390 395 400 

Pro Ile Ile Val Asp Ile Thr Lys Asp Thr Phe Tyr Lys Gln Pro Met 
405 410 415 

Phe Tyr His Leu Gly His Phe Ser Lys Phe Ile Pro Glu Gly Ser Gln 
420 425 430 

Arg Val Gly Leu Val Ala Ser Gln Lys Asn Asp Leu Asp Ala Val Ala 
435 440 445 

Leu Met His Pro Asp Gly Ser Ala Val Val Val Val Leu Asn Arg Ser 
450 455 460 

Ser Lys Asp Val Pro Leu Thr Ile Lys Asp Pro Ala Val Gly Phe Leu 
465 470 475 480 

Glu Thr Ile Ser Pro Gly Tyr Ser Ile His Thr Tyr Leu Trp Xaa Arg 
485 490 495 

Gln 

<210> 2 
<211> 1551 
<212> DNA 
<213> Homo sapiens 

<400> 2 

atggctggca gcctcacagg attgcttcta cttcaggcag tgtcgtgggc atcaggtgcc 60 
cgcccctgca tccctaaaag cttcggctac agctcggtgg tgtgtgtctg caatgccaca 120 
tactgtgact cctttgaccc cccgaccttt cctgcccttg gtaccttcag ccgctatgag 180 
agtacacgca gtgggcgacg gatggagctg agtatggggc ccatccaggc taatcacacg 240 
ggcacaggcc tgctactgac cctgcagcca gaacagaagt tccagaaagt gaagggattt 300 
ggaggggcca tgacagatgc tgctgctctc aacatccttg ccctgtcacc ccctgcccaa 360 
aatttgctac ttaaatcgta cttctctgaa gaaggaatcg gatataacat catccgggta 420 
cccatggcca gctgtgactt ctccatccgc acctacacct atgcagacac ccctgatgat 480 
ttccagttgc acaacttcag cctcccagag gaagatacca agctcaagat acccctgatt 540 
caccgagcac tgcagttggc ccagcgtccc gtttcactcc ttgccagccc ctggacatca 600 
cccacttggc tcaagaccaa tggagcggtg aatgggaagg ggtcactcaa gggacagccc 660 
ggagacatct accaccagac ctgggccaga tactttgtga agttcctgga tgcctatgct 720 
gagcacaagt tacagttctg ggcagtgaca gctgaaaatg agccttctgc tgggctgttg 780 
agtggatacc ccttccagtg cctgggcttc acccctgaac atcagcgaga cttaattgcc 840 
cgtgacctag gtcctaccct cgccaacagt actcaccaca atgtccgcct actcatgctg 900 
gatgaccaac gcttgctgct gccccactgg gcaaaggtgg tgctgacaga cccagaagca 960 
gctaaatatg ttcatggcat tgctgtacat tggtacctgg actttctggc tccagccaaa 1020 
gccaccctag gggagacaca ccgcctgttc cccaacacca tgctctttgc ctcagaggcc 1080 
tgtgtgggct ccaagttctg ggagcagagt gtgcggctag gctcctggga tcgagggatg 1140 
cagtacagcc acagcatcat cacgaacctc ctgtaccatg tggtcggctg gaccgactgg 1200 
aaccttgccc tgaaccccga aggaggaccc aattgggtgc gtaactttgt cgacagtccc 1260 
atcattgtag acatcaccaa ggacacgttt tacaaacagc ccatgttcta ccaccttggc 1320 
catttcagca agttcattcc tgagggctcc cagagagtgg ggctggttgc cagtcagaag 1380 
aacgacctgg acgcagtggc attgatgcat cccgatggct ctgctgttgt ggtcgtgcta 1440 
aaccgctcct ctaaggatgt gcctcttacc atcaaggatc ctgctgtggg cttcctggag 1500 
acaatctcac ctggctactc cattcacacc tacctgtggc gtcgccagtg a 1551 

<210> 3 
<211> 6186 
<212> DNA 
<213> Artificial sequence 

<222> (1225)..(1572) 
<223> Coding sequence for human FSH-alpha 

<400> 3 
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60 
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120 
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180 
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240 






gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300 
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360 
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420 
attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480 
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540 
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600 
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660 
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720 
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780 
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840 
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900 
ttattgcggt agtttatcac agttaaattg ctaacgcagt cagtgcttct gacacaacag 960 
tctcgaactt aagctgcagt gactctctta aggtagcctt gcagaagttg gtcgtgaggc 1020 
actgggcagg taagtatcaa ggttacaaga caggtttaag gagaccaata gaaactgggc 1080 
ttgtcgagac agagaagact cttgcgtttc tgataggcac ctattggtct tactgacatc 1140 
cactttgcct ttctctccac aggtgtccac tcccagttca attacagctc ttaaaagctt 1200 
ggtaccgagc tcggatccgc cacc atg gac tac tac cgc aag tac gcc gcc 1251 
                          Met Asp Tyr Tyr Arg Lys Tyr Ala Ala 

atc ttc ctg gtg acc ctg agc gtg ttc ctg cac gtg ctg cac agc gcc 
Ile Phe Leu Val Thr Leu Ser Val Phe Leu His Val Leu His Ser Ala 

ccc gac gtg cag gac tgc ccc gag tgc acc ctg cag gag aac ccc ttc 
Pro Asp Val Gln Asp Cys Pro Glu Cys Thr Leu Gln Glu Asn Pro Phe 

ttc agc cag ccc ggc gcc ccc atc ctg cag tgc atg ggc tgc tgc ttc 
Phe Ser Gln Pro Gly Ala Pro Ile Leu Gln Cys Met Gly Cys Cys Phe 

agc cgc gcc tac ccc acc ccc ctg cgc agc aag aag acc atg ctg gtg 
Ser Arg Ala Tyr Pro Thr Pro Leu Arg Ser Lys Lys Thr Met Leu Val 

cag aag aac gtg acc agc gag agc acc tgc tgc gtg gcc aag agc tac 
Gln Lys Asn Val Thr Ser Glu Ser Thr Cys Cys Val Ala Lys Ser Tyr 
75 80 85 

aac cgc gtg acc gtg atg ggc ggc ttc aag gtg gag aac cac acc gcc 
Asn Arg Val Thr Val Met Gly Gly Phe Lys Val Glu Asn His Thr Ala 
90 95 100 105 

tgc cac tgc agc acc tgc tac tac cac aag agc taatctagag ggcccgttta 1592 
Cys His Cys Ser Thr Cys Tyr Tyr His Lys Ser 
110 115 




aacccgctga tcagcctcga ctgtgccttc tagttgccag ccatctgttg tttgcccctc 1652
ccccgtgcct tccttgaccc tggaaggtgc cactcccact gtcctttcct aataaaatga 1712
ggaaattgca tcgcattgtc tgagtaggtg tcattctatt ctggggggtg gggtggggca 1772
ggacagcaag ggggaggatt gggaagacaa tagcaggcat gctggggatg cggtgggctc 1832
tatggcttct gaggcggaaa gaaccagctg gggctctagg gggtatcccc acgcgccctg 1892
tagcggcgca ttaagcgcgg cgggtgtggt ggttacgcgc agcgtgaccg ctacacttgc 1952
cagcgcccta gcgcccgctc ctttcgcttt cttcccttcc tttctcgcca cgttcgccgg 2012
ctttccccgt caagctctaa atcggggcat ccctttaggg ttccgattta gtgctttacg 2072
gcacctcgac cccaaaaaac ttgattaggg tgatggttca cgtagtgggc catcgccctg 2132
atagacggtt tttcgccctt tgacgttgga gtccacgttc tttaatagtg gactcttgtt 2192
ccaaactgga acaacactca accctatctc ggtctattct tttgatttat aagggatttt 2252
ggggatttcg gcctattggt taaaaaatga gctgatttaa caaaaattta acgcgaatta 2312
attctgtgga atgtgtgtca gttagggtgt ggaaagtccc caggctcccc aggcaggcag 2372
aagtatgcaa agcatgcatc tcaattagtc agcaaccagg tgtggaaagt ccccaggctc 2432
cccagcaggc agaagtatgc aaagcatgca tctcaattag tcagcaacca tagtcccgcc 2492
cctaactccg cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg 2552
ctgactaatt ttttttattt atgcagaggc cgaggccgcc tctgcctctg agctattcca 2612
gaagtagtga ggaggctttt ttggaggcct aggcttttgc aaaaagctcc cgggagcttg 2672
tatatccatt ttcggatctg atcagcacgt gatgaaaaag cctgaactca ccgcgacgtc 2732
tgtcgagaag tttctgatcg aaaagttcga cagcgtctcc gacctgatgc agctctcgga 2792
gggcgaagaa tctcgtgctt tcagcttcga tgtaggaggg cgtggatatg tcctgcgggt 2852
aaatagctgc gccgatggtt tctacaaaga tcgttatgtt tatcggcact ttgcatcggc 2912
cgcgctcccg attccggaag tgcttgacat tggggaattc agcgagagcc tgacctattg 2972
catctcccgc cgtgcacagg gtgtcacgtt gcaagacctg cctgaaaccg aactgcccgc 3032


tgttctgcag ccggtcgcgg aggccatgga tgcgatcgct gcggccgatc ttagccagac 3092
gagcgggttc ggcccattcg gaccgcaagg aatcggtcaa tacactacat ggcgtgattt 3152
catatgcgcg attgctgatc cccatgtgta tcactggcaa actgtgatgg acgacaccgt 3212
cagtgcgtcc gtcgcgcagg ctctcgatga gctgatgctt tgggccgagg actgccccga 3272
agtccggcac ctcgtgcacg cggatttcgg ctccaacaat gtcctgacgg acaatggccg 3332
cataacagcg gtcattgact ggagcgaggc gatgttcggg gattcccaat acgaggtcgc 3392
caacatcttc ttctggaggc cgtggttggc ttgtatggag cagcagacgc gctacttcga 3452
gcggaggcat ccggagcttg caggatcgcc gcggctccgg gcgtatatgc tccgcattgg 3512
tcttgaccaa ctctatcaga gcttggttga cggcaatttc gatgatgcag cttgggcgca 3572
gggtcgatgc gacgcaatcg tccgatccgg agccgggact gtcgggcgta cacaaatcgc 3632
ccgcagaagc gcggccgtct ggaccgatgg ctgtgtagaa gtactcgccg atagtggaaa 3692
ccgacgcccc agcactcgtc cgagggcaaa ggaatagcac gtgctacgag atttcgattc 3752
caccgccgcc ttctatgaaa ggttgggctt cggaatcgtt ttccgggacg ccggctggat 3812
gatcctccag cgcggggatc tcatgctgga gttcttcgcc caccccaact tgtttattgc 3872
agcttataat ggttacaaat aaagcaatag catcacaaat ttcacaaata aagcattttt 3932
ttcactgcat tctagttgtg gtttgtccaa actcatcaat gtatcttatc atgtctgtat 3992
accgtcgacc tctagctaga gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa 4052
ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt gtaaagcctg 4112
gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc ccgctttcca 4172
gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg ggagaggcgg 4232
tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4292
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4352
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4412
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4472
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4532
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4592
ctttctccct tcgggaagcg tggcgctttc tcaatgctca cgctgtaggt atctcagttc 4652
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4712
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4772


actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4832
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4892
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4952
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 5012
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 5072
acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 5132
ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 5192
ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 5252
tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag 5312
tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag caataaacca 5372
gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc 5432
tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt 5492
tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag 5552
ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt 5612
tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat 5672
ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt 5732
gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc 5792
ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat 5852
cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag 5912
ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt 5972
ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg 6032
gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta 6092
ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 6152
gcgcacattt ccccgaaaag tgccacctga cgtc 6186



<210> 4
<211> 5651
<212> DNA
<213> Artificial sequence

<220>
<221> exon
<222> (1231)..(1617)
<223> coding sequence for human FSH-beta

<400> 4
gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
ttattgcggt agtttatcac agttaaattg ctaacgcagt cagtgcttct gacacaacag 960
tctcgaactt aagctgcagt gactctctta aggtagcctt gcagaagttg gtcgtgaggc 1020
actgggcagg taagtatcaa ggttacaaga caggtttaag gagaccaata gaaactgggc 1080
ttgtcgagac agagaagact cttgcgtttc tgataggcac ctattggtct tactgacatc 1140
cactttgcct ttctctccac aggtgtccac tcccagttca attacagctc ttaaaagctt 1200


ggtaccgagc tcggatctat cgatgccacc atg gag acc ctg cag ttc ttc ttc 1254
Met Glu Thr Leu Gln Phe Phe Phe
1 5

ctg ttc tgc tgc tgg aag gcc atc tgc tgc aac agc tgc gag ctg acc 1302
Leu Phe Cys Cys Trp Lys Ala Ile Cys Cys Asn Ser Cys Glu Leu Thr
10 15 20

aac atc acc atc gcc atc gag aag gag gag tgc cgc ttc tgc atc agc 1350
Asn Ile Thr Ile Ala Ile Glu Lys Glu Glu Cys Arg Phe Cys Ile Ser
25 30 35 40

atc aac acc acc tgg tgc gcc ggc tac tgc tac acc cgc gac ctg gtg 1398
Ile Asn Thr Thr Trp Cys Ala Gly Tyr Cys Tyr Thr Arg Asp Leu Val
45 50 55

tac aag gac ccc gcc cgc ccc aag atc cag aag acc tgc acc ttc aag 1446
Tyr Lys Asp Pro Ala Arg Pro Lys Ile Gln Lys Thr Cys Thr Phe Lys
60 65 70

gag ctg gtg tac gag acg gtc cgg gtg ccc ggc tgc gcc cac cac gcc 1494
Glu Leu Val Tyr Glu Thr Val Arg Val Pro Gly Cys Ala His His Ala
75 80 85

gac agc ctg tac acc tac ccc gtg gcc acc cag tgc cac tgc ggc aag 1542
Asp Ser Leu Tyr Thr Tyr Pro Val Ala Thr Gln Cys His Cys Gly Lys
90 95 100

tgc gac agc gac agc acc gac tgc acc gtg cgc ggc ctg ggc ccc agc 1590
Cys Asp Ser Asp Ser Thr Asp Cys Thr Val Arg Gly Leu Gly Pro Ser
105 110 115 120

tac tgc agc ttc ggc gag atg aag gag taactcgaga ctagagggcc 1637
Tyr Cys Ser Phe Gly Glu Met Lys Glu
125



cgtttaaacc cgctgatcag cctcgactgt gccttctagt tgccagccat ctgttgtttg 1697
cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc tttcctaata 1757
aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg ggggtggggt 1817
ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg gggatgcggt 1877
gggctctatg gcttctgagg cggaaagaac cagctggggc tctagggggt atccccacgc 1937
gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt acgcgcagcg tgaccgctac 1997
acttgccagc gccctagcgc ccgctccttt cgctttcttc ccttcctttc tcgccacgtt 2057
cgccggcttt ccccgtcaag ctctaaatcg gggcatccct ttagggttcc gatttagtgc 2117
tttacggcac ctcgacccca aaaaacttga ttagggtgat ggttcacgta gtgggccatc 2177
gccctgatag acggtttttc gccctttgac gttggagtcc acgttcttta atagtggact 2237
cttgttccaa actggaacaa cactcaaccc tatctcggtc tattcttttg atttataagg 2297
gattttgggg atttcggcct attggttaaa aaatgagctg atttaacaaa aatttaacgc 2357
gaattaattc tgtggaatgt gtgtcagtta gggtgtggaa agtccccagg ctccccaggc 2417
aggcagaagt atgcaaagca tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc 2477
aggctcccca gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccatagt 2537
cccgccccta actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc 2597
ccatggctga ctaatttttt ttatttatgc agaggccgag gccgcctctg cctctgagct 2657


attccagaag tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agctcccggg 2717
agcttgtata tccattttcg gatctgatca gcacgtgttg acaattaatc atcggcatag 2777
tatatcggca tagtataata cgacaaggtg aggaactaaa ccatggccaa gttgaccagt 2837
gccgttccgg tgctcaccgc gcgcgacgtc gccggagcgg tcgagttctg gaccgaccgg 2897
ctcgggttct cccgggactt cgtggaggac gacttcgccg gtgtggtccg ggacgacgtg 2957
accctgttca tcagcgcggt ccaggaccag gtggtgccgg acaacaccct ggcctgggtg 3017
tgggtgcgcg gcctggacga gctgtacgcc gagtggtcgg aggtcgtgtc cacgaacttc 3077
cgggacgcct ccgggccggc catgaccgag atcggcgagc agccgtgggg gcgggagttc 3137
gccctgcgcg acccggccgg caactgcgtg cacttcgtgg ccgaggagca ggactgacac 3197
gtgctacgag atttcgattc caccgccgcc ttctatgaaa ggttgggctt cggaatcgtt 3257
ttccgggacg ccggctggat gatcctccag cgcggggatc tcatgctgga gttcttcgcc 3317
caccccaact tgtttattgc agcttataat ggttacaaat aaagcaatag catcacaaat 3377
ttcacaaata aagcattttt ttcactgcat tctagttgtg gtttgtccaa actcatcaat 3437
gtatcttatc atgtctgtat accgtcgacc tctagctaga gcttggcgta atcatggtca 3497
tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 3557
agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 3617
cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 3677
caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 3737
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 3797
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 3857
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 3917
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 3977
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 4037
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcaatgctca 4097
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 4157
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 4217
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 4277
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaagg 4337


acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 4397
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 4457
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 4517
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 4577
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 4637
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 4697
ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 4757
ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 4817
gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 4877
ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 4937
gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 4997
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 5057
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 5117
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 5177
tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 5237
atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 5297
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 5357
ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 5417
tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 5477
aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 5537
tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 5597
aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga cgtc 5651



<210> 5
<211> 92
<212> PRT
<213> Homo sapiens

<400> 5

Ala Pro Asp Val Gln Asp Cys Pro Glu Cys Thr Leu Gln Glu Asn Pro
1 5 10 15

Phe Phe Ser Gln Pro Gly Ala Pro Ile Leu Gln Cys Met Gly Cys Cys
20 25 30

Phe Ser Arg Ala Tyr Pro Thr Pro Leu Arg Ser Lys Lys Thr Met Leu
35 40 45

Val Gln Lys Asn Val Thr Ser Glu Ser Thr Cys Cys Val Ala Lys Ser
50 55 60

Tyr Asn Arg Val Thr Val Met Gly Gly Phe Lys Val Glu Asn His Thr
65 70 75 80

Ala Cys His Cys Ser Thr Cys Tyr Tyr His Lys Ser
85 90



<210> 6
<211> 111
<212> PRT
<213> Homo sapiens

<400> 6

Asn Ser Cys Glu Leu Thr Asn Ile Thr Ile Ala Ile Glu Lys Glu Glu
1 5 10 15

Cys Arg Phe Cys Ile Ser Ile Asn Thr Thr Trp Cys Ala Gly Tyr Cys
20 25 30

Tyr Thr Arg Asp Leu Val Tyr Lys Asp Pro Ala Arg Pro Lys Ile Gln
35 40 45

Lys Thr Cys Thr Phe Lys Glu Leu Val Tyr Glu Thr Val Arg Val Pro
50 55 60

Gly Cys Ala His His Ala Asp Ser Leu Tyr Thr Tyr Pro Val Ala Thr
65 70 75 80

Gln Cys His Cys Gly Lys Cys Asp Ser Asp Ser Thr Asp Cys Thr Val
85 90 95

Arg Gly Leu Gly Pro Ser Tyr Cys Ser Phe Gly Glu Met Lys Glu
100 105 110
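
The declared length in a `<211>` field can be checked against the residues actually listed under `<400>`. The following is a minimal sketch, not part of the original listing: it assumes the three-letter residues of SEQ ID NO: 5 as read above, and the helper name `to_one_letter` is purely illustrative.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert whitespace-separated three-letter residues to a one-letter string.

    Raises KeyError on any token that is not a standard residue, which is a
    convenient way to spot leftover OCR damage.
    """
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


# Residues of SEQ ID NO: 5 as transcribed from the listing (assumed reading).
SEQ_ID_5 = """
Ala Pro Asp Val Gln Asp Cys Pro Glu Cys Thr Leu Gln Glu Asn Pro
Phe Phe Ser Gln Pro Gly Ala Pro Ile Leu Gln Cys Met Gly Cys Cys
Phe Ser Arg Ala Tyr Pro Thr Pro Leu Arg Ser Lys Lys Thr Met Leu
Val Gln Lys Asn Val Thr Ser Glu Ser Thr Cys Cys Val Ala Lys Ser
Tyr Asn Arg Val Thr Val Met Gly Gly Phe Lys Val Glu Asn His Thr
Ala Cys His Cys Ser Thr Cys Tyr Tyr His Lys Ser
"""

seq5 = to_one_letter(SEQ_ID_5)
declared_length = 92  # value of the <211> field for SEQ ID NO: 5
print(len(seq5) == declared_length)
```

The same helper could be applied to SEQ ID NO: 6 with a declared length of 111.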



<210> 7
<211> 6213
<212> DNA
<213> Artificial sequence

<220>
<221> exon
<222> (1225)..(1599)
<223> Coding sequence for modified FSH-alpha

<400> 7

gacggatcgg gagatctccc gatcccctat ggtcgactct cagtacaatc tgctctgatg 60
ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg 120
cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc 180
ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt 240
gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 300
tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 360
cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 420
attgacgtca atgggtggac tatttacggt aaactgccca cttggcagta catcaagtgt 480
atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 540
atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 600
tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 660
actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 720
aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 780
gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca 840
ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc 900
ttattgcggt agtttatcac agttaaattg ctaacgcagt cagtgcttct gacacaacag 960
tctcgaactt aagctgcagt gactctctta aggtagcctt gcagaagttg gtcgtgaggc 1020
actgggcagg taagtatcaa ggttacaaga caggtttaag gagaccaata gaaactgggc 1080
ttgtcgagac agagaagact cttgcgtttc tgataggcac ctattggtct tactgacatc 1140
cactttgcct ttctctccac aggtgtccac tcccagttca attacagctc ttaaaagctt 1200


ggtaccgagc tcggatccgc cacc atg gac tac tac cgc aag tac gcc gcc 1251
Met Asp Tyr Tyr Arg Lys Tyr Ala Ala
1 5

atc ttc ctg gtg acc ctg agc gtg ttc ctg cac gtg ctg cac agc gcc 1299
Ile Phe Leu Val Thr Leu Ser Val Phe Leu His Val Leu His Ser Ala
10 15 20 25

aac atc acc gtt aac atc acc gtg gcc ccc gac gtg cag gac tgc ccc 1347
Asn Ile Thr Val Asn Ile Thr Val Ala Pro Asp Val Gln Asp Cys Pro
30 35 40

gag tgc acc ctg cag gag aac ccc ttc ttc agc cag ccc ggc gcc ccc 1395
Glu Cys Thr Leu Gln Glu Asn Pro Phe Phe Ser Gln Pro Gly Ala Pro
45 50 55

atc ctg cag tgc atg ggc tgc tgc ttc agc cgc gcc tac ccc acc ccc 1443
Ile Leu Gln Cys Met Gly Cys Cys Phe Ser Arg Ala Tyr Pro Thr Pro
60 65 70

ctg cgc agc aag aag acc atg ctg gtg cag aag aac gtg acc agc gag 1491
Leu Arg Ser Lys Lys Thr Met Leu Val Gln Lys Asn Val Thr Ser Glu
75 80 85

agc acc tgc tgc gtg gcc aag agc tac aac cgc gtg acc gtg atg ggc 1539
Ser Thr Cys Cys Val Ala Lys Ser Tyr Asn Arg Val Thr Val Met Gly
90 95 100 105

ggc ttc aag gtg gag aac cac acc gcc tgc cac tgc agc acc tgc tac 1587
Gly Phe Lys Val Glu Asn His Thr Ala Cys His Cys Ser Thr Cys Tyr
110 115 120

tac cac aag agc taatctagag ggcccgttta aacccgctga tcagcctcga 1639
Tyr His Lys Ser
125



ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgcct tccttgaccc 1699
tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca tcgcattgtc 1759
tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaag ggggaggatt 1819
gggaagacaa tagcaggcat gctggggatg cggtgggctc tatggcttct gaggcggaaa 1879
gaaccagctg gggctctagg gggtatcccc acgcgccctg tagcggcgca ttaagcgcgg 1939
cgggtgtggt ggttacgcgc agcgtgaccg ctacacttgc cagcgcccta gcgcccgctc 1999
ctttcgcttt cttcccttcc tttctcgcca cgttcgccgg ctttccccgt caagctctaa 2059
atcggggcat ccctttaggg ttccgattta gtgctttacg gcacctcgac cccaaaaaac 2119
ttgattaggg tgatggttca cgtagtgggc catcgccctg atagacggtt tttcgccctt 2179
tgacgttgga gtccacgttc tttaatagtg gactcttgtt ccaaactgga acaacactca 2239
accctatctc ggtctattct tttgatttat aagggatttt ggggatttcg gcctattggt 2299
taaaaaatga gctgatttaa caaaaattta acgcgaatta attctgtgga atgtgtgtca 2359
gttagggtgt ggaaagtccc caggctcccc aggcaggcag aagtatgcaa agcatgcatc 2419
tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc agaagtatgc 2479
aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc 2539
ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt 2599
atgcagaggc cgaggccgcc tctgcctctg agctattcca gaagtagtga ggaggctttt 2659
ttggaggcct aggcttttgc aaaaagctcc cgggagcttg tatatccatt ttcggatctg 2719
atcagcacgt gatgaaaaag cctgaactca ccgcgacgtc tgtcgagaag tttctgatcg 2779
aaaagttcga cagcgtctcc gacctgatgc agctctcgga gggcgaagaa tctcgtgctt 2839


tcagcttcga tgtaggaggg cgtggatatg tcctgcgggt aaatagctgc gccgatggtt 2899
tctacaaaga tcgttatgtt tatcggcact ttgcatcggc cgcgctcccg attccggaag 2959
tgcttgacat tggggaattc agcgagagcc tgacctattg catctcccgc cgtgcacagg 3019
gtgtcacgtt gcaagacctg cctgaaaccg aactgcccgc tgttctgcag ccggtcgcgg 3079
aggccatgga tgcgatcgct gcggccgatc ttagccagac gagcgggttc ggcccattcg 3139
gaccgcaagg aatcggtcaa tacactacat ggcgtgattt catatgcgcg attgctgatc 3199
cccatgtgta tcactggcaa actgtgatgg acgacaccgt cagtgcgtcc gtcgcgcagg 3259
ctctcgatga gctgatgctt tgggccgagg actgccccga agtccggcac ctcgtgcacg 3319
cggatttcgg ctccaacaat gtcctgacgg acaatggccg cataacagcg gtcattgact 3379
ggagcgaggc gatgttcggg gattcccaat acgaggtcgc caacatcttc ttctggaggc 3439
cgtggttggc ttgtatggag cagcagacgc gctacttcga gcggaggcat ccggagcttg 3499
caggatcgcc gcggctccgg gcgtatatgc tccgcattgg tcttgaccaa ctctatcaga 3559
gcttggttga cggcaatttc gatgatgcag cttgggcgca gggtcgatgc gacgcaatcg 3619
tccgatccgg agccgggact gtcgggcgta cacaaatcgc ccgcagaagc gcggccgtct 3679
ggaccgatgg ctgtgtagaa gtactcgccg atagtggaaa ccgacgcccc agcactcgtc 3739
cgagggcaaa ggaatagcac gtgctacgag atttcgattc caccgccgcc ttctatgaaa 3799
ggttgggctt cggaatcgtt ttccgggacg ccggctggat gatcctccag cgcggggatc 3859
tcatgctgga gttcttcgcc caccccaact tgtttattgc agcttataat ggttacaaat 3919
aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat tctagttgtg 3979
gtttgtccaa actcatcaat gtatcttatc atgtctgtat accgtcgacc tctagctaga 4039
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc 4099
cacacaacat acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct 4159
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc 4219
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt 4279
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 4339
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 4399
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 4459
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 4519
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 4579

ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 4639
tggcgctttc tcaatgctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 4699
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 4759
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 4819
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 4879
actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 4939
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 4999
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 5059
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 5119
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 5179
caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 5239
cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt 5299
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag 5359
acccacgctc accggctcca gatttatcag caataaacca gccagccgga agggccgagc 5419
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag 5479
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca 5539
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa 5599
ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga 5659
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata 5719
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca 5779
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg 5839
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg 5899
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg 5959
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag 6019
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac 6079
tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca 6139
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag 6199
tgccacctga cgtc 6213
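
The exon annotated at (1225)..(1599) of SEQ ID NO: 7 is printed with its translation beneath each codon row, so the two can be cross-checked. The sketch below is illustrative only and not part of the listing: it assumes the standard genetic code and uses the first three codon rows of the exon as read above. `translate` and `CODON_TABLE` are hypothetical names.

```python
import itertools

# Standard genetic code, codons ordered with bases t, c, a, g
# (first base varies slowest, third base fastest).
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) up to the first stop codon."""
    dna = "".join(dna.lower().split())
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


# First three codon rows of the SEQ ID NO: 7 exon (assumed reading of the listing).
EXON_START = """
atg gac tac tac cgc aag tac gcc gcc
atc ttc ctg gtg acc ctg agc gtg ttc ctg cac gtg ctg cac agc gcc
aac atc acc gtt aac atc acc gtg gcc ccc gac gtg cag gac tgc ccc
"""

print(translate(EXON_START))
```

A mismatch between this output and the printed three-letter translation would point at a transcription error in either the codon row or the residue row.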
<210> 8
<211> 5
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (5)
<223> T or S

<400> 8
Ala Ser Asn Ile Xaa
1 5



<210> 9
<211> 6
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (6)
<223> T or S

<400> 9
Ser Pro Ile Asn Ala Xaa
1 5



<210> 10
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (7)
<223> T or S

<400> 10
Ala Ser Pro Ile Asn Ala Xaa
1 5



<210> 11
<211> 11
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (4)
<223> T or S

<220>
<221> MOD_RES
<222> (8)
<223> T or S

<400> 11
Ala Asn Ile Xaa Ala Asn Ile Xaa Ala Asn Ile
1 5 10



<210> 12
<211> 14
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (4)
<223> T or S

<220>
<221> MOD_RES
<222> (9)
<223> T or S

<220>
<221> MOD_RES
<222> (14)
<223> T or S

<400> 12
Ala Asn Ile Xaa Gly Ser Asn Ile Xaa Gly Ser Asn Ile Xaa
1 5 10
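
Each `Xaa` annotated as "T or S" stands for either threonine or serine, so one entry describes a small family of peptides. As a hedged illustration (not part of the listing), the sketch below expands SEQ ID NO: 12 into its concrete variants and locates N-glycosylation sequons using the commonly cited Asn-X-Ser/Thr pattern with X not Pro. That pattern is an assumption brought in for the example, and `expand_xaa` and `sequon_positions` are hypothetical helper names.

```python
import itertools
import re

# Only the residues that occur in SEQ ID NO: 12 (plus the two Xaa choices).
THREE_TO_ONE = {"Ala": "A", "Asn": "N", "Ile": "I", "Gly": "G", "Ser": "S", "Thr": "T"}

SEQ_ID_12 = "Ala Asn Ile Xaa Gly Ser Asn Ile Xaa Gly Ser Asn Ile Xaa"


def expand_xaa(three_letter_seq, choices=("Thr", "Ser")):
    """Return every one-letter variant obtained by filling each Xaa with a choice."""
    residues = three_letter_seq.split()
    slots = [i for i, res in enumerate(residues) if res == "Xaa"]
    variants = []
    for combo in itertools.product(choices, repeat=len(slots)):
        filled = list(residues)
        for index, res in zip(slots, combo):
            filled[index] = res
        variants.append("".join(THREE_TO_ONE[res] for res in filled))
    return variants


# Assumed sequon pattern: Asn, any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(one_letter_seq):
    """Return 1-based positions of the Asn residue of each sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(one_letter_seq)]


variants = expand_xaa(SEQ_ID_12)
for variant in variants:
    print(variant, sequon_positions(variant))
```

Under these assumptions every variant carries the Asn of a sequon at the same three positions, whichever of T or S is chosen at each `Xaa`.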



<210> 13
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (5)
<223> T or S

<220>
<221> MOD_RES
<222> (9)
<223> T or S

<220>
<221> MOD_RES
<222> (13)
<223> T or S

<400> 13
Ala Ser Asn Ser Xaa Asn Asn Gly Xaa Leu Asn Ala Xaa
1 5 10



<210> 14
<211> 10
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (4)
<223> T or S

<220>
<221> MOD_RES
<222> (7)
<223> T or S

<220>
<221> MOD_RES
<222> (10)
<223> T or S

<400> 14
Ala Asn His Xaa Asn Glu Xaa Asn Ala Xaa
1 5 10



<210> 15
<211> 7
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (7)
<223> T or S

<400> 15
Gly Ser Pro Ile Asn Ala Xaa
1 5



<210> 16
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (7)
<223> T or S

<220>
<221> MOD_RES
<222> (13)
<223> T or S

<400> 16
Ala Ser Pro Ile Asn Ala Xaa Ser Pro Ile Asn Ala Xaa
1 5 10



<210> 17
<211> 10
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (4)
<223> T or S

<220>
<221> MOD_RES
<222> (7)
<223> T or S

<220>
<221> MOD_RES
<222> (10)
<223> T or S

<400> 17
Ala Asn Asn Xaa Asn Tyr Xaa Asn Trp Xaa
1 5 10



<210> 18
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (5)
<223> T or S

<220>
<221> MOD_RES
<222> (9)
<223> T or S

<220>
<221> MOD_RES
<222> (12)
<223> T or S

<400> 18
Ala Thr Asn Ile Xaa Leu Asn Tyr Xaa Ala Asn Xaa USax
1 5 10



<210> 19
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (5)
<223> T or S

<220>
<221> MOD_RES
<222> (9)
<223> T or S

<220>
<221> MOD_RES
<222> (13)
<223> T or S

<400> 19
Ala Ala Asn Ser Xaa Gly Asn Ile Xaa Ile Asn Gly Xaa
1 5 10



<210> 20
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (5)
<223> T or S

<220>
<221> MOD_RES
<222> (9)
<223> T or S

<220>
<221> MOD_RES
<222> (13)
<223> T or S

<400> 20
Ala Val Asn Trp Xaa Ser Asn Asp Xaa Ser Asn Ser Xaa
1 5 10



<210> 21
<211> 13
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (5)
<223> T or S

<220>
<221> MOD_RES
<222> (9)
<223> T or S

<220>
<221> MOD_RES
<222> (13)
<223> T or S

<400> 21
Ala Val Asn Trp Xaa Ser Asn Asp Xaa Ser Asn Ser Xaa
1 5 10



<210> 22
<211> 10
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<220>
<221> MOD_RES
<222> (4)
<223> T or S

<220>
<221> MOD_RES
<222> (7)
<223> T or S

<220>
<221> MOD_RES
<222> (10)
<223> T or S

<400> 22
Ala Asn Asn Xaa Asn Tyr Xaa Asn Ser Xaa
1 5 10



<210> 23
<211> 10
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Synthetic peptide

<400> 23
Ala Asn Asn Thr Asn Tyr Thr Asn Trp Thr
1 5 10



<210> 24
<211> 15
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: Linker

<400> 24
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
1 5 10 15



<210> 25 
<211> 35 
<212> im 

<213> Artificial Seq[uence 



<220> 

<223> Description of Artificial Sequence: Primer 
<400> 25 

cgcagatctg atggctggca gcctcacagg attgc 35



<210> 26 
<211> 37 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer
<400> 26 

ccggaattcc catcactggc gacgccacag gtaggtg 37



<210> 27 
<211> 35 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer 
<400> 27 

acgcgagctc gcccctgcat ccctaaaagc ttcgg 35 



<210> 28 
<211> 54
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Primer 
<400> 28

gcgttgacgg cagtcagagt tgacagaagg gccagccagc aaaggatagt catg 54



<210> 29 
<211> 62 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
<400> 29

ctagcatgac tatcctttgc tggctggccc ttctgtcaac tctgactgcc gtcaacgcag 60
ct 62 



<210> 30 
<211> 48 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
<400> 30

cctgctactg ctcccagcag cagtgaaaga gtccaaagtg gcagcatg 48



<210> 31
<211> 56
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
<400> 31

ctagcatgct gccactttgg actctttcac tgctgctggg agcagtagca ggagct 56



<210> 32
<211> 21
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Primer
<400> 32

cagctggcca tgggtacccg g 21



<210> 33 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: N-terminal
peptide addition

<400> 33

Ala Asn Ile Thr

1 



<210> 34 
<211> 7 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: N-terminal
peptide addition 

<400> 34 

Ala Ser Pro Ile Asn Ala Thr
1 5 



<210> 35 
<211> 48 
<212> DNA

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 35 

tgggcatcag gtgccaacat tacagcccgc ccctgcatcc ctaaaagc 48



<210> 36 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 36 

tttactgttt tcgtaacagt tttg 24 



<210> 37 
<211> 48 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer



<400> 37 
gcaggggcgg gctgtaatgt tggcacctga tgcccacgac actgcctg 48



<210> 38 
<211> 13
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic

peptide 

<220> 

<221> MOD_RES
<222> (1)..(13) 

<223> "Xaa" represents a variable amino acid 
<400> 38 

Ala Xaa Asn Xaa Thr Xaa Asn Xaa Thr Xaa Asn Xaa Thr
1 5 10



<210> 39 
<211> 10 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<220> 

<221> MOD_RES

<222> (1)..(10) 

<223> "Xaa" represents a variable amino acid
<400> 39 

Ala Asn Xaa Thr Asn Xaa Thr Asn Xaa Thr 
1 5 10



<210> 40 
<211> 81 
<212> DNA 

<213> Artificial Sequence 
<220> 

<221> modified_base
<222> (1)..(81) 

<223> "n" represents a, t, c, g, other or unknown
<220>

<223> Description of Artificial Sequence: Primer
<400> 40 

gtgtcgtggg catcaggtgc cnnsaaydns acbdnsaayd nsachdnsaa ydnsachgcc 60 
cgcccctgca tccctaaaag c 81



WO 02/02597

PCT/DK01/00459



<210> 41 
<211> 27 
<212> DNA

<213> Artificial Sequence



<400> 42 

ggcacctgat gcccacgaca ctgcctg 27



<210> 43 
<211> 68 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer
<220>

<221> modified_base
<222> (1)..(68)

<223> "nnn" is a mixture of trinucleotide codons for all
natural amino acid residues, except proline

<400> 43 

cgtgggcatc aggtgccaay nnnachaayn nnachaaynn nachgcccgc ccctgcatcc 60
ctaaaagc 68 



<210> 44 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Primer 
<400> 44 

gttggcacct gatgcccacg acactgcctg 30



<210> 45 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<220> 

<221> MOD_RES
<222> (4)
<223> variable amino acid
<220> 

<221> MOD_RES 
<222> (12) 
<223> P or L 

<400> 45 

Ala Phe Asn Xaa Thr Leu Asn Lys Thr Trp Asn Xaa Thr
1 5 10



<210> 46 
<211> 13 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 46 

Thr Met Asn Asn Thr Trp Asn Trp Thr Trp Asn Trp Thr
1 5 10



<210> 47 
<211> 13 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 47 

Ala Leu Asn Ser Thr Gly Asn Leu Thr Val Asp Gly Thr
1 5 10 



<210> 48 
<211> 13 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 48

Ala Ser Asn Ser Thr Phe Asn Leu Thr Glu Asn Leu Thr
1 5 10



<210> 49 
<211> 12 
<212> PRT

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 49 

Thr Arg Asn Val Thr Ile Asn Cys Thr Asn Ser Thr
1 5 10



<210> 50 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 50 

Ala Leu Asn Trp Thr Tyr Asn Gly Thr Lys Asn Val Thr 
1 5 10



<210> 51 
<211> 13 
<212> PRT 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 51 

Ala Ala Asn Trp Thr Val Asn Phe Thr Gly Asn Phe Thr
1 5 10



<210> 52 
<211> 12 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 

peptide 

<220> 

<221> MOD_RES
<222> (2)

<223> variable amino acid 
<220> 

<221> MOD_RES
<222> (4) 

<223> variable amino acid 
<400> 52

Ala Xaa Asn Xaa Thr Val Asn Ser Thr Asn Val Thr
1 5 10



<210> 53 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 53

Ala Asn Asn Phe Thr Phe Asn Gly Thr Leu Asn Leu Thr
1 5 10 



<210> 54 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 54 

Ala Gly Asn Trp Thr Ala Asn Val Thr Val Asn Val Thr 
1 5 10



<210> 55 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 55 

Ala Gly Asn Ser Thr Ser Asn Val Thr Gly Asn Trp Thr
1 5 10



<210> 56 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 
<400> 56

Ala Val Asn Ser Thr Met Asn lis His Ala Ile Pro Pro
1 5 10



<210> 57 
<211> 13 
<212> PRT 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 57 

Ala Gly Asn Gly Thr Val Asn Gly Thr Ile Asn Gly Thr
1 5 10



<210> 58 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<220> 

<221> MOD_RES
<222> (8) 

<223> variable amino acid 
<400> 58 

Ala Val Asn Ser Thr Gly Asn Xaa Thr Gly Asn Trp Thr
1 5 10 



<210> 59 
<211> 12 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 59 

Ala Gly Asn Gly Thr Asn Gly Thr Ser Asn Leu Thr
1 5 10 



<210> 60 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 60 

Ala Met Asn Ser Thr Lys Asn Ser Thr Leu Asn Ile Thr
1 5 10 



<210> 61 
<211> 10 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 61 

Ala Phe Asn Tyr Thr Ser Lys Asn Ser Thr 
1 5 10



<210> 62 
<211> 13 
<212> PRT 
<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 62 

Ala Val Asn Ala Thr Met Asn Trp Thr Ala Asn Gly Thr
1 5 10 



<210> 63 
<211> 13 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 63 

Ala Ser Asn Ser Thr Asn Asn Gly Thr Leu Asn Ala Thr
1 5 10



<210> 64 
<211> 13 
<212> PRT 

<213> Artificial Sequence 



<220> 
<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 64 

Ala Arg Asn Lys Thr Lys Asn Phe Thr Ile Asn Leu Thr
1 5 10



<210> 65 
<211> 12 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 65 

Ala Pro Asn Ile Thr Asn Asp Thr Val Asn Met Thr
1 5 10



<210> 66 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 66 

Ala Gln Asn Lys Thr Phe Asn Phe Thr Met Asn Cys Thr
1 5 10



<210> 67 
<211> 13 
<212> PRT 

<213> Artificial Sequence 



<223> Description of Artificial Sequence: Synthetic 

peptide 

<400> 67 

Ala Leu Asn Val Thr Trp Asn Cys Thr Leu Asn Leu Thr 
1 5 10



<210> 68 

<211> 10 

<212> PRT 

<213> Artificial Sequence



<223> Description of Artificial Sequence: Synthetic 






peptide 
<400> 68 

Ala Leu Asn Thr Thr Trp Thr Asn Leu Thr
1 5 10



<210> 69

<211> 10 

<212> PRT

<213> Artificial Sequence



<223> Description of Artificial Sequence: Synthetic 

peptide 

<400> 69 

Ala Asn Thr Thr Asn Phe Thr Asn Glu Thr
1 5 10



<210> 70 
<211> 10 
<212> PRT

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 70 

Ala Asn Trp Thr Asn Arg Thr Asn pys Thr
1 5 10



<210> 71 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 71 

Ala Asn Trp Thr Asn Phe Thr Asn Trp Thr
1 5 10



<210> 72 
<211> 10 
<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 



<400> 72

Pro Thr Gly Leu Ile Gly Thr Asn Phe Thr



<210> 73 
<211> 10 
<212> PRT

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 73 

Ala Asn Trp Thr Asn Lys Thr Asn Phe Thr



<210> 74
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 74 

Ala Asn Asn Thr Asn Leu Thr Asn Ala Thr



<210> 75 
<211> 10 
<212> PRT 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 75 

Ala Asn Tyr Thr Asn Trp Thr Asn Phe Thr



<210> 76 
<211> 10 
<212> PRT 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 



<400> 76

Ala Asn Thr Thr Asn Gln Thr Asn Asp Thr



<210> 77 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 77 

Ala Asn Arg Thr Asn Trp Thr Asn Vbr Thr



<210> 78 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 78 

Pro Thr Ala Thr Asn His Thr Asn Ser Thr 



<210> 79 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 79

Ala Asn Trp Thr Asn Gln Thr Asn Gln Thr
1 5 10 



<210> 80 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide



<400> 80 



Ala Asn Trp Thr Asn Trp Thr Asn Ala Thr



<210> 81 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 82 

Ala Asn Phe Thr Asn Lys Thr Asn Met Thr



<210> 83 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220>

<223> Description of Artificial Sequence: Synthetic 
peptide 



<210> 84 
<211> 10 
<212> PRT 

<213> Artificial Sequence 



peptide 

<220> 
<221> MOD_RES
<222> (3) 
<223> C or W 

<400> 84 

Ala Asn Xaa Thr Asn Phe Thr Asn Glu Thr



<210> 85 
<211> 9 
<212> PRT 

<213> Artificial Sequence 



<220> 






<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 85 

Ala Asn Leu Asp Lys Leu His Lys His
1 5 



<210> 86
<211> 11 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 86 

Ala Asn Cys Phe Thr Asn Gln Thr Asn Phe Thr
1 5 10



<210> 87 
<211> 11 
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 87 

Ala Asn Trp Thr Asn Trp Thr Asn Glu Trp Thr
1 5 10



<210> 88
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 88 

Ala Asn Cys Thr Asn Trp Thr Asn Cys Thr
1 5 10 



<210> 89 
<211> 10 
<212> PRT 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Synthetic 



peptide
<400> 89

Cys His Pro Tyr Asn Trp Thr Asn Trp Thr
1 5 10



<210> 90 
<211> 10 
<212> PRT 

<213> Artificial Sequence 



<223> Description of Artificial Sequence: Synthetic 

peptide 

<400> 90 

Ala Asn Glu Thr Asn Tyr Thr Asn Glu Thr 
1 5 10



<210> 91 
<211> 7 
<212> PRT 

<213> Artificial Sequence 



<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 91 

Ala Asn Trp Thr Asn Trp Thr
1 5 



<210> 92 
<211> 10 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 92 

Ala Lys Pro Tyr Lys Ser Tyr Lys Phe Tyr



<210> 93 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 



<400> 93

Ala Asn Ile Thr Asn Lys Thr Asn Trp Thr
1 5 10



<210> 94 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 94 

Ala Asn Trp Thr Asn Met Thr Asn Ile Thr
1 5 10



<210> 95 
<211> 10 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 95 

Ala Asn Asn Thr Asn Arg Thr Asn Phe Thr
1 5 10



<210> 96 
<211> 10 
<212> PRT

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 96 

Ala Asn Trp Thr Asn Trp Thr Asn Trp Thr
1 5 10



<210> 97 

<211> 11 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 






<400> 97 

Ala Asn Trp Arg Thr Asn His Thr Asn Lys Thr
1 5 10



<210> 98 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 98 

Ala Asn Gln Thr Asn Ile Thr Asn Trp Thr
1 5 10



<210> 99 
<211> 11 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 99 

Ala Asn Phe Thr Asn Val Ala Thr Asn Gln Thr
1 5 10



<210> 100 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<220> 

<221> MOD_RES
<222> (1) 

<223> most probable amino acid 
<220> 

<221> MOD_RES
<222> (2) 

<223> most probable amino acid 

<220> 

<221> MOD_RES 
<222> (5) 

<223> variable amino acid 
<220>

<221> MOD_RES
<222> O) 

<223> most probable amino acid 
<400> 100 

Ala Asn Thr Thr Xaa Leu Thr Asn Lys Thr 
1 5 10



<210> 101 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<220> 

<221> MOD_RES
<222> (6)
<223> S or C 

<400> 101 

Ala Asn Lys Thr Asn Xaa Thr Asn Ile Thr
1 5 10



<210> 102 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 

peptide 

<220> 

<221> MOD_RES 
<222> O) 

<223> most probable amino acid
<400> 102 

Ala Asn Trp Thr Asn Cys Thr Asn Ile Thr
1 5 10



<210> 103 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 






<220> 

<221> MOD_RES
<222> (6) 
<223> F or L 

<400> 103 

Ala Asn Trp Thr Asn Xaa Thr Asn Trp Thr
1 5 10



<210> 104 
<211> 10 
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 104 

Cys Gln Leu Asp Arg Ser Thr Asn Glu Thr
1 5 10



<210> 105 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 105 

Ala Asn Asn Thr Asn Tyr Thr Asn Trp Thr
1 5 10



<210> 106 
<211> 10 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 106

Ala Asn Asn Thr Asn Tyr Thr Asn Trp Thr
1 5 10



<210> 107 
<211> 12 
<212> PRT 

<213> Artificial Sequence 



<220>

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 107 

Ala Ala Asn Asp Thr Asn Trp Thr Val Asn Cys Thr
1 5 10 



<210> 108 
<211> 13 
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 108 

Ala Vhr Asn Ile Thr Leu Asn Tyr Thr Ala Asn Thr Thr
1 5 10 



<210> 109 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 109 

Ala Ala Asn Ser Thr Gly Asn Ile Thr Ile Asn Gly Thr
1 5 10



<210> 110 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 110 

Ala Val Asn Trp Thr Ser Asn Asp Thr Ser Asn Ser Thr
1 5 10



<210> 111 
<211> 13 
<212> PRT 

<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
peptide 

<400> 111 

Ala Ser Pro Ile Asn Ala Thr Ser Pro Ile Asn Ala Thr
1 5 10



<210> 112 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Linker

<400> 112 
Gly Gly Gly Gly 
1 



<210> 113 
<211> 4 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Linker

<400> 113

Gly Asn Ala Thr



<210> 114 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 114 

Asn Ser Thr Gln Asn Ala Thr Ala
1 5 

<210> 115 
<211> 14 
<212> PRT

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 

peptide 

<400> 115 

Ala Asn Leu Thr Val Arg Asn Leu Thr Arg Asn Val Thr Val



